Abstract (Article Summary)

Total Quality Management (TQM) was implemented in 1989 at US West Technologies. In retrospect, almost 
everything was done incorrectly. Ultimately, however, US West learned how to adapt and focus TQM to deliver 
quantum improvements in speed, quality, and cost. Quantum improvement delivers 50% or greater reductions in 
cycle time, defects, or cost. The 5 biggest mistakes made were: 1. focusing on learning instead of results, 2. lack of 
focus, 3. lack of sponsorship, 4. trying to involve everyone, and 5. teaching theory instead of developing real-world 
experience. The eventual, successful implementation of TQM at US West is described. Quantum improvements 
require focused planning, problem solving, and follow-through. The following process, applying the Shewhart Cycle, 
works well to drive quantum improvements in overall software quality: 1. Plan - focus the improvement effort. 2. Do 
- focus on immediate results. 3. Check. 4. Act. 






Full Text (3448 words) 

Copyright Association for Computing Machinery Jun 1997 



In 1989, U S WEST Technologies began implementing Total Quality Management (TQM). In the years that 
followed, TQM spread throughout the company. In retrospect, we did almost everything wrong. Ultimately we 
learned how to adapt and focus TQM to deliver not just incremental improvements, but quantum improvements in 
speed, quality, and cost. Quantum improvement delivers 50% or greater reductions in cycle time, defects, or cost. 

This is our story. 

The Technologies group began with goals in terms of number of people trained and number of teams started. 
These goals drove our behavior. More than 4,000 people were trained. More than 100 teams were started. Each 
team brainstormed problems (quality initiatives) and set about solving them. Each team met for one hour per week.
Months passed. Years passed. A few teams were able to identify root causes and implement meaningful solutions, 
but most eventually gave up. 

I studied the skills and abilities of successful and unsuccessful improvement teams. Let's look at the biggest 
mistakes the unsuccessful ones made and how the successful ones continue to be different. 
The Five Biggest Mistakes

Mistake #1. Focusing on learning instead of results. Remember, we set goals for the number of people trained and
the number of teams started. Wrong goals! Measure success by reductions in defects, cycle time, and the costs of 
waste and rework. Deming's immortal cry was for "profound knowledge." As a result, most companies focused on 
learning the tools and techniques rather than achieving usable results. Deming was wrong. Initially, people need 
only an essential understanding of the process and tools required to achieve results. 

Americans have a results-oriented, immediate gratification culture. If we cannot immediately see how something 
will improve our lives, we discard it. The improvement processes and tools must be experienced in such a way that 
participants immediately grasp the power of using improvement tools and achieving a useful result. Otherwise, they 
quickly abandon it. Once they succeed with essential knowledge, they are ready to explore the profound
knowledge (the depth of knowledge) available.

Mistake #2. Lack of focus. We allowed teams to "brainstorm" problem areas. Some teams picked problems over 
which they had no direct control. They tried to fix other groups' problems. Some teams fell into the quality of 
worklife trap and spent months trying to decide how to rearrange their work space. Brainstorming broadened and 
blurred the focus. 

To achieve quantum improvements, you must focus the improvement effort by identifying the few contributors to the 
cost that are responsible for the bulk of the problem [7]. This Pareto principle says that 20% of your effort delivers 
80% of the result, but this still isn't focused enough to produce quantum improvements. From working with many 
teams and companies, it has become apparent that as little as 4% of business systems deliver over 60% of the 
waste and rework. Frederick Brooks, for example, said that 4% of IBM's OS/360 code contained 64% of the bugs
[4]. The secret to getting quantum (50% or greater) improvements instead of just incremental (10% or less)
improvements is to focus improvement efforts even more tightly.
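
The focusing step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the defect log, the module names, and the 60% threshold are hypothetical and do not come from the U S WEST data. The sketch ranks categories by defect count and returns the smallest leading group that accounts for the chosen share of the total.

```python
from collections import Counter


def pareto_table(categories):
    """Return (category, count, cumulative share) rows, largest category first."""
    counts = Counter(categories)
    total = sum(counts.values())
    rows = []
    running = 0
    for category, count in counts.most_common():
        running += count
        rows.append((category, count, running / total))
    return rows


def vital_few(categories, threshold=0.6):
    """Return the smallest leading set of categories covering `threshold` of all items."""
    selected = []
    for category, _count, cumulative in pareto_table(categories):
        selected.append(category)
        if cumulative >= threshold:
            break
    return selected


# Hypothetical defect log: each entry names the module a defect was traced to.
defect_log = ["billing"] * 32 + ["orders"] * 9 + ["reports"] * 5 + ["login"] * 4

for category, count, cumulative in pareto_table(defect_log):
    print(f"{category:<10} {count:>3} {cumulative:6.1%}")
print(vital_few(defect_log))
```

In this made-up log a single module out of four accounts for 64% of the defects, which is the kind of concentration an improvement team would target first.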

Having teams meet for only an hour a week also showed a lack of focus. If a problem is worth addressing, it's worth 
solving now! Most hourly meetings took 20 minutes of start up to remember where we were, 20 minutes of real 
work, and 20 minutes of shut down. This was a tremendously inefficient use of time. 

Longer or more focused problem-solving meetings seem to be optimal. Focusing on results and direct application of 
improvement techniques to immediate problems creates a sense of urgency and immediacy that says to the 
employee that something has to happen now. 

Mistake #3. Lack of sponsorship. Initial problems were selected in team brainstorming sessions. This often led to 
selecting problems that plagued the team, not their customers. Letting teams pick problems also failed to create 
leadership support for the proposed solutions. Leadership wanted results, but they only saw teams meeting week 
after week with no visible result. 

Leaders also had trouble brainstorming problems. They often picked the effects they wanted rather than
problems to be solved. They wanted to "fix morale" or "improve customers' perceptions of the department." Poor
morale or negative perceptions are the effects of real problems, like autocratic leadership, product delays, and 
defects. Customer perceptions improve only as a by-product of being better, faster, and cheaper than your 
competition. It took over two years, at one to two hours per week, to work through the maze of customer 
perceptions, but by then, the leader had moved on and there was no support for our proposed solutions. 

Mistake #4. Trying to involve everyone, not just the people focused on key results. Gerald Weinberg said: "The 
wider you spread it, the thinner it gets" [12]. The cost of a five-day quality team leader course given by two 
consultants was over $20,000. Because we wanted to train everyone, we had to reduce these costs. We did so by 
developing internal trainers. But no matter how well they were trained, they had no actual experience using TQM to 
achieve beneficial results. The training failed to impart the beliefs, values, skills, and abilities necessary to be 
successful at quality improvement. 

To accelerate the speed of implementation, companies need to hire skilled practitioners to lead the improvement in 
the key problem areas. The team members of these initial teams then become the first wave of internal team 
leaders. 
Mistake #5. Teaching theory instead of developing real-world experience. When I first learned the quality
problem-solving process, I spent five days in what turned out to be an excellent class. At the end, I knew that I'd received
some of the best training I'd ever had. I was so impressed that a few months later, I became an instructor. When I 
taught my first class, I hadn't actually used the process to solve a business problem. I felt like a fraud. I thought my 
students would catch on that I hadn't actually solved a problem, but they didn't. 




Figure 1. Pareto chart: Causes of client-server outages



As instructors, we followed the slow, methodical instructional process that Deming probably learned at the Colorado
School of Mines in 1922:



Explain the theory (profound knowledge) 
Show a simple example 

Have the participants do a practice problem from a mythical case study. 

This approach is mainly for the left-brain, conscious mind, and it is slow. Most students simply cannot generalize 
the mythical case study to fit their own work environment. The best way to be successful at teaching quality 
improvement is to develop improvement skills as a by-product of actually improving something vital to the business. 

Hopefully, this has illuminated some of the tarpits on the road to software system quality. Now, let's look at what 
works. 



A Quantum Improvement Story 

U S WEST has a large, 12-site client-server operation used by more than 8,000 service representatives to respond 
to customer requests. Over 200 support personnel sustain the daily operating environment. At the beginning of 
1995, the average "person minutes of outage" (the number of minutes service representatives were unable to 
access their systems) stood at over 100,000 minutes per week: 1,666 hours, 208 days, or 42 weeks of outage per
week.
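
The conversion behind those figures can be checked with a short calculation. The article does not say how minutes were converted to days and weeks; the sketch below assumes eight-hour workdays and five-day workweeks, which is the combination that reproduces the published numbers (the article truncates the hours figure rather than rounding it).

```python
MINUTES_PER_HOUR = 60
HOURS_PER_WORKDAY = 8   # assumption: not stated in the article
WORKDAYS_PER_WEEK = 5   # assumption: not stated in the article


def outage_equivalents(person_minutes):
    """Convert person-minutes of outage into hours, workdays, and workweeks."""
    hours = person_minutes / MINUTES_PER_HOUR
    workdays = hours / HOURS_PER_WORKDAY
    workweeks = workdays / WORKDAYS_PER_WEEK
    return hours, workdays, workweeks


hours, workdays, workweeks = outage_equivalents(100_000)
print(f"{hours:.1f} hours, {workdays:.1f} workdays, {workweeks:.1f} workweeks")
```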

The V.P. of Communications and Information Services (CIS, which handles most of the computer operations) set a goal of
reducing the seat minutes of outage by 50% by the end of calendar year 1995. He commissioned a core team of three
people to oversee the improvement effort. Technicians at each site investigated outages, analyzed root causes,
implemented solutions, and wrote trouble tickets for each outage, no matter how small. The core team used these
trouble tickets to develop a master QI story, a story that links all of the problem-solving efforts to achieve the desired
quantum improvement. Trouble tickets were categorized and analyzed to find key areas for improvement. As the
Pareto chart in Figure 1 shows, server software and hardware accounted for 66% of all outages. Application
software accounted for an additional 28% of the outages.

The core team then initiated subteams to analyze the root causes and propose long-term countermeasures in each 
key problem area. Each team met for as long as necessary (usually a day or less) and then disbanded. Root 
causes included file corruption, process errors, and various hardware element failures. Proposed solutions included 
operating system upgrades and application software changes to prevent file errors, process improvements, and 
hardware maintenance. The core team project managed the ongoing implementation of long-term countermeasures 
and measurement of results. Some of the projects included upgrading all of the servers to the next release of Unix, 
replacing unsupported vendor hardware, and physically moving servers to prevent outages. 
By mid-1995, seat minutes of outage had fallen to below 30,000 minutes, a 74% reduction. By the end of the year, the
figure was down to 20,566 per week, a 79% reduction (see comparison in Figure 2). Overall, this improvement effort
added more than four million minutes of system availability to the client-server environment: 69,000 hours, or over
8,600 days. Total system availability rose to 99.85%. This team won one of only eight Gold Chairman's Awards for
quality improvement. In 1996, they used the same strategy to halve their outages again! They are now below
10,000 minutes/week.



As this example demonstrates, quantum improvements are possible. They require focused planning, problem solving,
and follow-through. Based on this success, CIS is using a customized version of the QI Coloring Books to expand
their improvement efforts and renew their improvement skills in 1996 (see footnote 1).



A Billing Improvement Story 







Figure 2. Comparison charts showing results of countermeasures 



A similar approach was used in the billing environment. In a three-day session, the leadership team developed a 
master QI story that identified the key problem areas: service order errors, message (long distance) errors, bill
cycle time, cost of billing adjustments, bill format, postage costs, and new product implementation time. The 
leadership team also identified leaders and members for each team. Root cause teams met for as long as three 
days to identify and verify root causes and to propose countermeasures. One member of the leadership team 
coordinated the implementation strategy. 



Dramatic improvements occurred across the board. Manual adjustments costing $25 apiece were automated,
reducing their cost to $3. Postage costs, which had doubled to $60 million, were tackled together with a radically
redesigned bill format, reducing postage by $20 million to $30 million. Billing cycle time was reduced by one to
two days, resulting in improved cash flow. Service order errors decreased by over 50%. Message errors, which had
been processed in FIFO (first in, first out) order, were instead processed in an order that maximizes the accuracy of
the next day's billing cycle.



These master QI stories also facilitated the initial step of billing reengineering. Data gathered by all teams
supported the cost of integrating the three main billing systems into one system that leverages the best of all three.
Root causes identified by all teams created requirements for the new system. When completed, this project will
dramatically reduce the cost of billing, increase billing accuracy, and deliver a significantly improved bill to 
customers. 



How to Succeed at Improving Software Quality 



In working with U S WEST, it is clear there is a way to drive improvements in speed, quality, and cost two-fold to 
ten-fold using TQM. First, begin seeing software as a system composed of people, organizations, processes, 
application and operating system software, networks, computers, terminals, training materials, and so on. Look at 
the total system. The application software may be a trivial part of overall software quality. 



There is a clear need for achieving results. The following process, applying the Shewhart Cycle, works well to drive 
quantum improvements in overall software quality: 



Plan: Focus the improvement effort. In a three-day session, skilled improvement facilitators help leadership develop 
a master QI story that identifies the key problems to be solved. Where are customers experiencing the most pain?
It is always in one of three areas: reducing defects (system outages), reducing cycle time, or reducing the cost of 
waste and rework resulting from the defects. Leaders further stratify the overall problem into smaller slices that are 
easily addressed by a team. Multiple teams will be required to address all of the issues. Leadership also picks the 
team members who have the skills to address the problem or issue. Having the leaders do this creates the 
sponsorship and focuses the improvement effort. 






Do: Focus on immediate results. In two-day sessions, skilled improvement facilitators help team members analyze 
the root causes and develop countermeasures (proposed solutions) to prevent or eliminate the problem. Team 
members return to their work. The leadership team manages the project of implementing the countermeasures and 
measuring the results. 

If the team cannot identify root causes and countermeasures in a two-day session, then either the wrong people 
are in the room or the problem, as stated, cannot be resolved using this method. Leadership will need to review the 
problem definition and decide on next steps, if any. 

Use accelerated methods to teach the essential methods and tools required to accomplish the team's objective. 
These methods can develop improvement skills in as little as two hours by repeated demonstration. Humans are 
masterful at detecting patterns of behavior and then modeling and repeating them. Three seems to be the minimum 
number of repetitions required to create an initial level of competence. The following process demonstrates the 
improvement strategy: 

Tell the participants a story of how these tools and processes were used to resolve a similar issue. Demonstrate 
how you have used the improvement processes to solve a real problem or manage an existing process. 

Demonstrate how to use the process on two or three issues selected by the members. Make sure they have a 
sense of how the process works (repetition 2-3). 

Discuss what they've learned and how it applies to the problem at hand. 

Check: Leadership manages the implementation and monitors the results of the proposed solutions. 

Act: Dramatic improvements need to be stabilized and sustained. Leadership monitors and manages the improved 
processes to ensure continued high performance. Leadership may also refocus existing improvement efforts as 
needed, set new targets for improvement, and continue the never-ending process of reducing defects, time, and 
cost. 

Quantum improvements in software quality, cycle time, and cost are readily available by applying this process to 
the entire system. When properly applied, it will maximize your results, minimize the resources required, rapidly 
develop internal expertise, and create sustainable competitive advantage. 

The Natural Laws of Software Quality 

There are also ways to create and evolve software to ensure quality. Because you can grow software four times 
faster than you can build it [10], there are some natural laws that will help ensure software quality. The waterfall 
model of software development is like building. Software can be grown using rapid application development: start
small, expand, and adjust course until a working system is delivered.

Pareto's Rule holds true, especially in software systems: 

20% of the code has 80% of the defects [1]: Find them! Fix them! Remember Fred Brooks: 4% of OS/360 had over
60% of the bugs. Similarly, I worked on a reusable software library; two of the first 11 modules (20%) had all of the 
bugs. Dig into the trouble log for any system and you will discover that a small percentage of the system causes 
most of the defects and outages. Have your best programmer revise the errant code. 

20% of the code requires 80% of the enhancements: Find them by looking into enhancement logs to find out where
most changes occur. Externalize the changes in data or decision tables. Modularize the existing system. In one 
large message processing program, these two changes reduced the number of people required from five 
experienced programmers to one-half of a junior programmer's time. Cycle time for changes was cut from months 
to days. 
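
As a rough illustration of externalizing changes in a decision table, the sketch below keeps routing rules as data instead of nested conditionals. The message types, regions, and handler names are hypothetical and are not taken from the message processing program mentioned above; the point is only that a new rule becomes a new row rather than a code change.

```python
# Hypothetical routing rules for a message processor, kept as data.
# Each row: (message type, region, handler name). "*" matches any value.
RULES = [
    ("long_distance", "*", "rate_long_distance"),
    ("local", "metro", "rate_flat"),
    ("local", "*", "rate_local"),
]


def choose_handler(message_type, region, rules=RULES):
    """Return the handler name from the first matching rule, or None if no rule matches."""
    for rule_type, rule_region, handler in rules:
        if rule_type in ("*", message_type) and rule_region in ("*", region):
            return handler
    return None


print(choose_handler("local", "metro"))
print(choose_handler("local", "rural"))
```

Because rules are evaluated top to bottom, more specific rows are placed before general ones. In a real system the table would more likely live in a file or database that maintainers can edit without touching the program.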

Know the limits: The human mind can tolerate at most 7 ± 2 bits of information [11]; telephone numbers have seven digits,
social security numbers have nine. Cyclomatic complexity, a measure of the number of decisions in a module,
subprogram, or object of code [9], should never exceed nine. After more than a decade of studying IEEE Transactions on
Software Engineering and IEEE Software, I have seen article after article show that if you want zero-defect
software, you should keep the number of decisions under 10. One such study of Pascal and Fortran programs [8] found that a
cyclomatic complexity of 5 to 15 minimized the number of changes. This simple rule is based on the work of
Genichi Taguchi [7], which says there is an ideal target value for any product dimension (in this case 7 ± 2 decisions).
As you move away from the target value, there is an increasing cost. Smaller modules (0-4 decisions) cost more
because there are more of them, while larger modules (greater than nine decisions) cost more to maintain because
of their complexity. Programs become unmaintainable because their decision complexity exceeds 50 (7 * 7) [6].
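
One way to put the decision limit into practice is to count decisions automatically. The following is a minimal sketch using Python's standard `ast` module; it approximates cyclomatic complexity as one plus the number of branch points and is a deliberate simplification (for example, it ignores conditions inside comprehensions and counts nested functions toward their enclosing function). The sample function and the helper names are invented for illustration.

```python
import ast

# Node types counted as one decision each. This is a simplification of
# the cyclomatic complexity metric, chosen for illustration only.
DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def decision_count(source):
    """Map each function name in `source` to its approximate cyclomatic complexity."""
    results = {}
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            decisions = 0
            for child in ast.walk(node):
                if isinstance(child, DECISION_NODES):
                    decisions += 1
                elif isinstance(child, ast.BoolOp):
                    # `a and b and c` adds two extra branch points.
                    decisions += len(child.values) - 1
            results[node.name] = decisions + 1
    return results


def over_limit(source, limit=9):
    """Return the sorted names of functions whose score exceeds `limit`."""
    return sorted(name for name, score in decision_count(source).items() if score > limit)


SAMPLE = '''
def classify(n):
    if n < 0 and n != -1:
        return "negative"
    for d in range(2, n):
        if n % d == 0:
            return "composite"
    return "other"
'''

print(decision_count(SAMPLE))
print(over_limit(SAMPLE))
```

A check like `over_limit` could run as part of a build so that modules drifting past nine decisions are flagged before they become hard to maintain.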

One software engineer decided to test this theory. His company's software controlled huge machines that cut steel 
parts for naval ships. Limiting modules to nine decisions and a maximum of 300 executable lines of code not only 
cut maintenance costs by 10% but also cut scrap costs by 25% on the huge plates of steel used. The software was 
easier to understand and therefore easier to optimize. 

In the reuse project I worked on, the first modules had a decision complexity of 50. No one used them. As we 
simplified the modules and their complexity approached nine decisions/module, reuse increased dramatically. 

If you want flexible, high quality, reusable software, know these limits. Use them. 

Use a process. Any consistent software development, maintenance, or operational process is better than the so- 
called "creative" process employed by many. A consistent process can be defined using flowcharts, measured 
using defects, time, or cost, and managed via some form of process management. This is especially important for 
daily operations. A defined and managed software process catapults your software process maturity to Level 4 on 
the Software Engineering Institute (SEI) scale. For SEI purists, this is grossly oversimplified, but essentially correct.

Any consistent process can then be optimized (Level 5) using the quantum improvement process and sustained by 
monitoring and managing the process metrics. 

Conclusions 

The power of TQM can be harnessed to deliver quantum improvements in the speed, quality, and cost of software 
creation, evolution, and operation. It requires: 

Attention to the entire software system: people, process, machines, materials, application software, operating system
software, and environment

Focus on results: reducing defects, cycle time, waste, and rework

Heroic goals (50% or greater reductions) 

Sponsorship, support, and guidance from the leadership team 

Skilled improvement leaders: consultants at first and experienced internal leaders thereafter

A rapid, outcome-oriented approach to problem solving that delivers maximum results with minimum resources 

Continuous attention to stabilizing, sustaining, and replicating the resulting improvements. 

Permission to make digital/hard copy of part or all of this work for personal or classroom use is granted without fee 
provided that copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of 
the publication and its date appear, and notice is given that copying is by permission of ACM, Inc. To copy 
otherwise, to republish, to post on servers, or to redistribute to lists requires prior specific permission and/or a fee. 

[Footnote] 

1. The QI Coloring Book is a 40-page teaching aid that combines a parable and a workbook. The parable teaches the QI
process by having Robin Hood help the King and Queen solve their quality problems. The workbook mirrors the parable.

[Headnote]

Distributed data processing is becoming a reality. Businesses want to do it for many reasons, and they often must do it in 
order to stay competitive. While much of the infrastructure for distributed data processing is already there (e.g., modern 
network technology), a number of issues make distributed data processing still a complex undertaking: (1) distributed 
systems can become very large, involving thousands of heterogeneous sites including PCs and mainframe server 
machines; (2) the state of a distributed system changes rapidly because the load of sites varies over time and new sites 
are added to the system; (3) legacy systems need to be integrated-such legacy systems usually have not been designed 
for distributed data processing and now need to interact with other (modern) systems in a distributed environment. 
This paper presents the state of the art of query processing for distributed database and information systems. The paper 
presents the "textbook" architecture for distributed query processing and a series of techniques that are particularly useful 
for distributed database systems. These techniques include special join techniques, techniques to exploit intraquery 
parallelism, techniques to reduce communication costs, and techniques to exploit caching and replication of data. 
Furthermore, the paper discusses different kinds of distributed systems such as client-server, middleware (multitier), and 
heterogeneous database systems, and shows how query processing works in these systems. 

[Headnote] 

Categories and Subject Descriptors: E.5 [Data]: Files; H.2.4 [Database Management Systems]: Distributed Databases,
Query Processing; H.2.5 [Heterogeneous Databases]: Data Translation
General Terms: Algorithms, Performance 

Additional Key Words and Phrases: Query optimization, query execution, client-server databases, middleware, multitier 
architectures, database application systems, wrappers, replication, caching, economic models for query processing, 
dissemination-based information systems 

Although there was a clear need and many good ideas and prototypes (e.g., System R* [Williams et al. 1981], 
SDD-1 [Bernstein et al. 1981], and Distributed Ingres [Stonebraker 1985]), the early efforts in building distributed 
database systems were never commercially successful [Stonebraker 1994]. In some aspects, the early distributed
database systems were ahead of their time. First, communication technology was not stable enough to ship 
megabytes of data as required for these systems. Second, large businesses somehow managed to survive without 
sophisticated distributed database technology by sending tapes, diskettes, or just paper to exchange data between 
their offices. 

Today, the situation has changed dramatically. Distributed data processing is both feasible and needed. Almost all 
major database system vendors offer products to support distributed data processing (e.g., IBM, Informix, 
Microsoft, Oracle, Sybase), and large database application systems have a distributed architecture (e.g., business 
application systems such as Baan IV, Oracle Finance, Peoplesoft 7.5, and SAP R/3). Distributed data processing is
feasible because of recent technological advances (e.g., hardware, software protocols, standards). Distributed data 
processing is needed because of changing business requirements, which have made distributed data processing 
cost-effective and in certain situations the only viable option. Specifically, businesses are beginning to rely on 
distributed rather than centralized databases for the following reasons: 

1. Cost and scalability. Today, one thousand PC processors are cheaper and significantly more powerful than one 
big mainframe computer. So, it makes economic sense to replace a mainframe by a network of small, off-the-shelf 
processors. Furthermore, it is very difficult to "up-size" a mainframe computer if a company grows, while new PCs 
can be added to the network at any time in order to meet a company's new requirements. High availability can be 
achieved by mirroring (replicating) data. 

2. Integration of different software modules. It has become clear that no single software package can meet all the 
requirements of a company. Companies must, therefore, install several different packages, each potentially with its 
own database, and the result is a distributed database system. Even single software packages offered by one 
vendor have a distributed, component-based architecture so that the vendor can market and offer upgrades for
every component individually. 

3. Integration of legacy systems. The integration of legacy systems is one particular example that demonstrates 
how some companies are forced to rely on distributed data processing in which their old legacy systems need to 
coexist with new modern systems. 

4. New applications. There are a number of new emerging applications that rely heavily on distributed database 
technology; examples are workflow management, computer-supported collaborative work, tele-conferencing, and 
electronic commerce. 

5. Market forces. Many companies are forced to reorganize their businesses and use state-of-the-art distributed 
information technology in order to remain competitive. As an example, people will probably not eat more pizza
because of the Internet, but a pizza delivery service is definitely going to lose some of its market share if it does not
allow people to order pizza on the Web.

This list shows that there are many different reasons to rely on distributed architectures and correspondingly many 
different kinds of distributed systems exist. Sometimes it is only the software and not the hardware that is 
distributed. The purpose of this paper is to give a comprehensive overview of what query processing techniques 
are needed to implement any kind of distributed database and information system. It is assumed that users and 
application programs issue queries using a declarative query language such as SQL [Melton and Simon 1993] or 
OQL [Cattell et al. 1997] and without knowing where and in which format the data is stored in the distributed 
system. The goal is to execute such queries as efficiently as possible in order to minimize the time that users must 
wait for answers or the time application programs are delayed. To this end, we will discuss a series of techniques 
that are particularly effective to execute queries in today's distributed systems. For example, we will describe the 
design of a query optimizer that compiles a query for execution and determines the best possible way among many 
alternative ways to execute a query. We will also show how techniques such as caching and replication can be 
used to improve the performance of queries in a distributed environment. Furthermore, we will cover specific query 
processing techniques for client-server, middleware (multitier), and heterogeneous database and information 
systems, which represent architectures that are frequently found in practice. 

1.2 Scope of this Paper and Related Surveys 

A very large body of work in the general area of database systems exists. All this work can be roughly classified 
into work on architectures and techniques for transaction processing (i.e., quickly processing small update 
operations), work on query processing (i.e., mostly read operations that explore large amounts of data), and work 
on data models, languages, and user interfaces for advanced applications. In this paper, we will focus primarily on 
query processing. A discussion of transaction processing and of alternative data models is beyond the scope of this 
paper. Transaction processing has been thoroughly investigated in, for example, Gray and Reuter [1993]. Work on 
data models (relational, deductive, object-oriented, and semistructured) is described in Ullman [1988], Cattell et al. 
[1997], Abiteboul [1997], and Buneman [1997]. Also, we will assume that the reader is familiar with basic database 
system concepts, SQL, and the relational data model. Good introductory textbooks are Silberschatz et al. [1997] 
and Ramakrishnan [1997]. 
This paper cannot give full coverage of all query processing techniques used today; in particular,
a number of query processing techniques for the World Wide Web are not discussed. For instance, we will not 
present the architecture of search engines such as AltaVista. Furthermore, there have been several proposals to 
manage Web sites and query a network of Web pages; see Florescu et al. [1998] for a survey. In addition, several 
proposals to manage and query XML data exist (e.g., McHugh and Widom [1999], Abiteboul et al. [1999], and 
Florescu et al. [1999]). Instead of going into the details of all these techniques, the focus of this paper is on 
fundamental mechanisms to process queries that involve data from several sites. We will, therefore, concentrate on 
structured data (such as that found in relational or object-oriented databases) and on query languages for 
structured data (such as SQL or OQL). Nevertheless, the techniques described in this paper are also relevant to 
process other kinds of data in a distributed environment. 

A parallel database system is a particular type of distributed system. Distributed and parallel database systems 
share several properties and goals, in particular if the parallel system has a so-called "shared-nothing" architecture
[Stonebraker 1986]. The purpose of a parallel database system is to improve transaction and query response 
times, and the availability of the system for centralized applications. Parallel systems, therefore, emphasize the 
cost/scalability arguments described above, while the distributed systems discussed in this paper often address 
issues such as the heterogeneity of components. While some query processing techniques are useful for both 
kinds of systems, researchers in both areas have developed special-purpose techniques for their particular 
environment. In this paper, we will concentrate on the techniques that are of interest for distributed database 
systems, and will not discuss techniques which are specifically used in parallel database systems (e.g., special 
parallel join methods, repartitioning of data during query execution, etc.). An excellent overview of parallel
database systems is given in DeWitt and Gray [1992]. 

In terms of related work, there have been several surveys on distributed query processing; for example, a paper by 
Yu and Chang [1984] and parts of the books by Ceri and Pelagatti [1984], Ozsu and Valduriez [1999], and Yu and 
Meng [1997] are devoted to distributed query processing. These surveys, however, are mostly focused on the 
presentation of the techniques used in the early prototypes of the 1970s and 1980s. While there is some overlap,
most of the material presented in this paper is not covered in those articles and books simply because the 
underlying technology and business requirements have significantly changed in the last few years. 

1.3 Organization of this Paper 

This paper is organized as follows:

Section 2 presents the textbook architecture for query processing and a series of basic query execution techniques
that are useful for all kinds of distributed database systems.

Section 3 takes a closer look at query processing for one particular and very important class of distributed
database systems: client-server database systems.

Section 4 deals with the query processing issues that arise in heterogeneous database systems, that is, systems
that are composed of several autonomous component databases with different schemas, varying query processing
capabilities, and application programming interfaces (APIs).

Section 5 shows how data placement (i.e., replication and caching) and query processing interact and shows how
data can dynamically and automatically be distributed in a system in order to achieve good performance.

Section 6 describes other emerging and promising architectures for distributed data processing; specifically, this
section gives an overview of economic models for distributed query processing and dissemination-based
information systems.

Section 7 contains conclusions and summarizes open problems for future research.

2. DISTRIBUTED QUERY PROCESSING: BASIC APPROACH AND TECHNIQUES

In this section, we will describe the "textbook" architecture for query processing and present a series of specific 
query processing techniques for distributed database and information systems. These techniques include 
alternative ways to ship data from one site to one or several other sites, implement joins, and carry out certain kinds 
of queries in a distributed environment. The purpose of this section is to give an overview of basic mechanisms that 
can be used in any kind of distributed database system. In Sections 3 and 4, we will discuss the techniques that
are particularly useful for certain classes of distributed database systems (i.e., client-server and heterogeneous 
database systems). 
2.1 Architecture of a Query Processor

Figure 1 shows the classic "textbook" architecture for query processing. This architecture was used, for example, in 
IBM's Starburst project [Haas et al. 1989]. This architecture can be used for any kind of database system including 
centralized, distributed, or parallel systems. The query processor receives an SQL (or OQL) query as input, 
translates and optimizes this query in several phases into an executable query plan, and executes the plan in order 
to obtain the results of the query. If the query is an interactive ad hoc query (dynamic SQL), the plan is directly 
executed by the query execution engine and the results are presented to the user. If the query is a canned query 
that is part of an application program (embedded SQL), the plan is stored in the database and executed by the 
query execution engine every time the application program is executed [Chamberlin et al. 1981]. Below is a brief
description of each component of the query processor.

Parser. In the first phase, the query is parsed and translated into an internal representation (e.g., a query graph
[Jenq et al. 1990; Pirahesh et al. 1992]) that can be easily processed by the later phases. The development of
parsers is well understood [Aho et al. 1987], and tools like flex and bison can be used for the construction of SQL
or OQL parsers just as for most other programming languages. The same parser can be used for a centralized and
distributed database system.

Query Rewrite. Query rewrite
transforms a query in order to carry out optimizations that are good regardless of the physical state of the system 
(e.g., the size of tables, presence of indices, locations of copies of tables, speed of machines, etc.) [Pirahesh et al. 
1992]. Typical transformations are the elimination of redundant predicates, simplification of expressions, and 
unnesting of subqueries and views. In a distributed system, query rewrite also selects the partitions of a table that 
must be considered to answer a query [Ceri and Pelagatti 1984; Ozsu and Valduriez 1999]. Query rewrite is carried 
out by a sophisticated rule engine [Pirahesh et al. 1992].

Query Optimizer. This component carries out
optimizations that depend on the physical state of the system. The optimizer decides which indices to use to 
execute a query, which methods (e.g., hashing or sorting) to use to execute the operations of a query (e.g., joins 
and group-bys), and in which order to execute the operations of a query. The query optimizer also decides how 
much main memory to allocate for the execution of each operation. In a distributed system, the optimizer must also 
decide at which site each operation is to be executed. To make these decisions, the optimizer enumerates 
alternative plans (described below) and chooses the best plan using a cost estimation model. Almost all 
commercial query optimizers are based on dynamic programming in order to enumerate plans efficiently. Dynamic 
programming and considerations for cost estimation in a distributed system are described in more detail in
Section 2.2.

Plan. A plan specifies precisely how the query is to be executed. Probably every database system represents
plans in the same way: as trees. The nodes of a plan are operators, and every operator carries out one particular
operation (e.g., join, group-by, sort, scan, etc.). The nodes of a plan are annotated, indicating, for instance, where
the operator is to be carried out. The edges of a plan represent consumer-producer relationships of operators. 
Figure 2 shows an example plan for a query that involves Tables A and B. The plan specifies that Table A is read at 
Site 1 using an index (the idxscan(A) operator), B is read at Site 2 without an index (the scan(B) operator), A and B 
are shipped to Site 0 (the send and receive operators), B is materialized and reread at Site 0 (the temp and scan 
operators), and finally, A and B are joined at Site 0 using a nested-loop join method (the NLJ operator). The send 
and receive operators encapsulate all the communication activity so that all other operators (e.g., NLJ or scan) can 
be implemented and used in the same way as in a centralized database system.

Plan Refinement/Code Generation. This component transforms the plan produced by the optimizer into an
executable plan. In System R,
for example, this transformation involves the generation of an assembler-like code to evaluate expressions and 
predicates efficiently [Lorie and Wade 1979]. In some systems, plan refinement also involves carrying out simple 
optimizations which are not carried out by the query optimizer in order to simplify the implementation of the query 
optimizer.

Query Execution Engine. This component provides generic implementations for every operator (e.g.,
send, scan, or NLJ). All state-of-the-art query execution engines are based on an iterator model [Graefe 1993]. In
such a model, operators are implemented as iterators and all iterators have the same interface. As a result, any two
iterators can be plugged together (as specified by the consumer-producer relationship of a plan), and thus, any
plan can be executed. Another advantage of the iterator model is that it supports the pipelining of results from one
operator to another in order to achieve good performance.

Catalog. The catalog stores all the information needed in
order to parse, rewrite, and optimize a query. It maintains the schema of the database (i.e., definitions of tables, 
views, user-defined types and functions, integrity constraints, etc.), the partitioning schema (i.e., information about 
what global tables have been partitioned and how they can be reconstructed), and physical information such as the 
location of copies of partitions of tables, information about indices, and statistics that are used to estimate the cost 
of a plan. In most relational database systems, the catalog information is stored like all other data in tables. In a 
distributed database system, the question of where to store the catalog arises. The simplest approach is to store 
the catalog at one central site. In wide-area networks, it makes sense to replicate the catalog at several sites in 
order to reduce communication costs. It is also possible to cache catalog information at sites in a wide-area 
network [Williams et al. 1981]. Both replication and caching of catalog information are very effective because 
catalogs are usually quite small (hundreds of kilobytes rather than gigabytes) and catalog information is rarely 
updated in most environments. In certain environments, however, the catalog can become very large and be 
frequently updated. In such environments, it makes sense to partition the catalog data and store catalog data 
where it is most needed. For example, catalogs of distributed object databases need to know where copies of all 
the objects (potentially millions) are stored, and they need to update this information every time an object is
migrated or replicated. Such catalogs can be implemented in a hierarchical way as described in Eickler et al. 
[1997]. 
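
To make the iterator model and the example plan of Figure 2 more concrete, the following Python sketch may help. It is only an illustration under simplifying assumptions: tables are in-memory lists, the send and receive operators are collapsed into one hypothetical Ship operator that merely counts the rows it forwards, and the class and function names are invented for this example rather than taken from any system discussed here.

```python
from typing import Callable, List, Optional, Tuple

Row = Tuple


class Operator:
    """Every operator exposes the same interface: open(), next(), close()."""

    def __init__(self, site: int) -> None:
        self.site = site  # site annotation of this plan node

    def open(self) -> None:
        raise NotImplementedError

    def next(self) -> Optional[Row]:
        raise NotImplementedError

    def close(self) -> None:
        pass


class Scan(Operator):
    """Reads a stored table (here: a list of rows) one row at a time."""

    def __init__(self, rows: List[Row], site: int) -> None:
        super().__init__(site)
        self.rows, self.pos = rows, 0

    def open(self) -> None:
        self.pos = 0

    def next(self) -> Optional[Row]:
        if self.pos >= len(self.rows):
            return None  # end of stream
        self.pos += 1
        return self.rows[self.pos - 1]


class Ship(Operator):
    """Hypothetical stand-in for a send/receive pair; counts forwarded rows."""

    def __init__(self, child: Operator, site: int) -> None:
        super().__init__(site)
        self.child, self.shipped = child, 0

    def open(self) -> None:
        self.child.open()

    def next(self) -> Optional[Row]:
        row = self.child.next()
        if row is not None:
            self.shipped += 1
        return row

    def close(self) -> None:
        self.child.close()


class Temp(Operator):
    """Materializes its input on the first open() so that it can be rescanned."""

    def __init__(self, child: Operator, site: int) -> None:
        super().__init__(site)
        self.child = child
        self.buffer: Optional[Scan] = None

    def open(self) -> None:
        if self.buffer is None:
            self.child.open()
            rows = []
            while (row := self.child.next()) is not None:
                rows.append(row)
            self.child.close()
            self.buffer = Scan(rows, self.site)
        self.buffer.open()

    def next(self) -> Optional[Row]:
        return self.buffer.next()


class NestedLoopJoin(Operator):
    """Rescans the inner input once for every row of the outer input."""

    def __init__(self, outer: Operator, inner: Operator,
                 pred: Callable[[Row, Row], bool], site: int) -> None:
        super().__init__(site)
        self.outer, self.inner, self.pred = outer, inner, pred
        self.outer_row: Optional[Row] = None

    def open(self) -> None:
        self.outer.open()
        self.outer_row = self.outer.next()
        self.inner.open()

    def next(self) -> Optional[Row]:
        while self.outer_row is not None:
            while (inner_row := self.inner.next()) is not None:
                if self.pred(self.outer_row, inner_row):
                    return self.outer_row + inner_row
            self.outer_row = self.outer.next()
            self.inner.open()  # restart the scan of the materialized inner
        return None

    def close(self) -> None:
        self.outer.close()
        self.inner.close()


def figure2_plan(a_rows: List[Row], b_rows: List[Row]):
    """A read at Site 1, B read at Site 2, both shipped to Site 0 and joined there."""
    ship_a = Ship(Scan(a_rows, site=1), site=0)
    ship_b = Ship(Scan(b_rows, site=2), site=0)
    join = NestedLoopJoin(ship_a, Temp(ship_b, site=0),
                          pred=lambda a, b: a[0] == b[0], site=0)
    return join, ship_a, ship_b


def run(plan: Operator) -> List[Row]:
    """Drives a plan to completion through the iterator interface."""
    plan.open()
    out = []
    while (row := plan.next()) is not None:
        out.append(row)
    plan.close()
    return out
```

Because every operator implements the same three calls, the join does not need to know whether its inputs are local scans or shipped streams; in this sketch only Ship is aware of the communication, which mirrors the role the text assigns to send and receive.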




Fig. 2. 



It should be noted that the architecture shown in Figure 1 and described in this subsection is not the only possible 
way to process queries. There is no such thing as a perfect query processor. An alternative architecture has, for 
example, been developed by Graefe and others as part of the Exodus, Volcano, and Cascades projects [Graefe 
1995; Graefe and McKenna 1993; Graefe and DeWitt 1987], and is used in several commercial database products 
(e.g., Microsoft SQL Server 7.0). In that architecture, query rewrite and query optimization are carried out in one 
phase. Furthermore, there have been proposals to optimize a set of queries rather than individual queries [Sellis 
1988]. The advantage of such an approach is that common subexpressions (e.g., joins) that are part of several 
queries need only be carried out once for the whole set of queries. 




Fig. 3 
2.2 Query Optimization

We now turn to a description of techniques that can be used to implement the query
optimizer of a distributed database system. We will first describe the most popular enumeration algorithm for query
optimization. After that, we will describe two cost models that can be used to estimate the cost of a plan.
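
Dynamic programming, the enumeration algorithm discussed in Section 2.2.1, can be illustrated with a small Python sketch. This is not the algorithm of any particular product: the cost model (the sum of estimated intermediate result sizes), the selectivity table, and the function name dp_join_order are assumptions made for this example, and site selection is left out entirely. The sketch builds the cheapest plan for every subset of tables from the cheapest plans of its parts, which is also why its time and space requirements grow exponentially with the number of tables.

```python
from itertools import combinations
from typing import Dict, FrozenSet


def dp_join_order(card: Dict[str, float], sel: Dict[FrozenSet[str], float]):
    """Return (cost, plan) of the cheapest join tree under a toy cost model.

    card maps table names to row counts; sel maps a pair of table names to
    the selectivity of the join predicate between them (1.0 if absent).
    """
    tables = sorted(card)
    # best[S] = (cost, estimated rows, plan) for the subset S of tables
    best = {frozenset([t]): (0.0, float(card[t]), t) for t in tables}

    for size in range(2, len(tables) + 1):
        for subset in combinations(tables, size):
            s = frozenset(subset)
            # Try every split of s into two non-empty parts.
            for k in range(1, size):
                for left in combinations(subset, k):
                    l = frozenset(left)
                    r = s - l
                    l_cost, l_rows, l_plan = best[l]
                    r_cost, r_rows, r_plan = best[r]
                    factor = 1.0
                    for a in l:
                        for b in r:
                            factor *= sel.get(frozenset((a, b)), 1.0)
                    rows = l_rows * r_rows * factor
                    cost = l_cost + r_cost + rows
                    # Keep only the cheapest plan per subset (pruning).
                    if s not in best or cost < best[s][0]:
                        best[s] = (cost, rows, (l_plan, r_plan))

    cost, _, plan = best[frozenset(tables)]
    return cost, plan
```

For instance, with hypothetical cardinalities A = 1000, B = 100, C = 10 and selectivities 0.01 for A-B and 0.1 for B-C, the sketch prefers to join B and C first because that keeps the intermediate result small. A distributed optimizer would additionally have to keep plans apart by the site at which their result is produced, which this sketch does not attempt.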
2.2.1 Plan Enumeration with Dynamic Programming. A large number of alternative enumeration algorithms have
been proposed in the literature; Steinbrunn et al. [1997] contains a good overview, and Kossmann and Stocker
[2000] evaluate the most important algorithms for distributed database systems. In the following, dynamic 
programming is described. This algorithm is used in almost all commercial database products, and it was pioneered 
in IBM's System R project [Selinger et al. 1979]. The advantage of dynamic programming is that it produces the 
best possible plans if the cost model is sufficiently accurate. The disadvantage of this algorithm is that it has 
exponential time and space complexity so that it is not viable for complex queries; in particular, in a distributed 
system, the complexity of dynamic programming is prohibitive for many queries. An extension of the dynamic 
programming algorithm is known as iterative dynamic programming. This extended algorithm is adaptive and 
produces as good plans as basic dynamic programming for simple queries and "as good as possible plans" for 
complex queries for which dynamic programming is not viable. We do not describe this extended algorithm in this 
paper and refer the interested reader to Kossmann and Stocker [2000].

3. CLIENT-SERVER DATABASE SYSTEMS

We now turn to a specific class of distributed systems: systems with a client-server architecture. We will
first characterize different kinds of client-server systems and then deal with one of the crucial questions for query
processing in these systems: if and how to exploit the resources of client machines. We will then discuss query
optimization and query execution issues, and present several techniques that are popular for query processing in a 
client-server environment. Some of the techniques presented in this section are also applicable to other system 
architectures. These techniques are presented in this section because they are mostly used by client-server 
database systems. 

3.1 Client-Server, Peer-to-Peer, and Multitier Architectures

In general, client-server (or master-slave) refers to a
class of protocols that allows one site, the client, to send a request to another site, the server, which sends an 
answer as a response to this request [Tanenbaum 1992]. Using this mechanism, it is possible to implement a 
variety of different database architectures. 

Peer-to-peer. This is the most general architecture. In peer-to-peer systems every site can act as a server that
stores parts of the database and as a client that executes application programs and initiates queries.

(Strict) client-server. In a strict client-server system every site has the fixed role of always acting either as a client
(query source) or as a server (data source). In such a strict client-server architecture, not all the sites can
communicate with each other: typically, two clients do not interact and often servers do not interact either.

Middleware, multitier. In such an architecture, the sites are organized in a hierarchical way. Every site plays the role
of a server for the sites at the upper level and the role of a client for the lower-level sites. Thus, a site in one of the
middle tiers can only communicate with its clients at the level above or its servers at the level below; typically, a site
cannot communicate with sites at the same or any other level.
Many examples of distributed database systems with these kinds of architecture can be found. SHORE [Carey et
al. 1994] is an example of a system with a peer-to-peer architecture; SHORE is an experimental distributed 
database system developed at the University of Wisconsin. Most commercial database systems today have a strict 
client-server architecture. Compared to a peer-to-peer architecture, one advantage of a strict separation between 
client and server machines is that only server machines need to be administered (i.e., backed up). Also, security 
issues can be addressed by controlling the server machines and the client-server communication links. Another 
advantage is that client and server machines can be equipped according to their specific purposes. Client machines 
are often PCs with good support for graphical user interfaces whereas server machines are usually more powerful 
with multiple processors, large disks (possibly RAID), and very good I/O performance. An example of a three-tier
middleware system is an Intranet with clients running a WWW browser and one or several WWW servers that are 
connected to database back-end servers. Another example of a middleware system is SAP R/3 [Buck-Emden and 
Galimow 1996]. SAP is the market leader for business application software (ERP). SAP R/3 installations consist of 
at least three tiers: (1) presentation servers, which drive the GUIs of the users' desktops, (2) application servers, 
which implement the business application logic, and (3) database backend servers, which store all the data. 
Integrating functionality from different vendors is one reason to use a middleware architecture (i.e., different 
functionality is provided at different layers of the system). Scalability can be another reason to use a middleware 
architecture: at every tier, additional sites (i.e., processors) can be added in order to deal with a heavier load. 

In the remainder of this section, we will describe query processing techniques that are applicable for all three 
architectures. For easier presentation and to avoid confusion with the terms client and server, we will concentrate 
on the strict client-server architecture and assume that every site has the fixed role of acting either as a client or as 
a server while processing a query. Nevertheless, all techniques are applicable to all three architectures because all 
three architectures are based on the same paradigm in which query sites and data sites can be different. 

3.2 Exploiting Client Resources

The essence of client-server computing is that the database is persistently stored
by server machines and that queries are initiated at client machines. The question is whether to execute a query at 
the client machine at which the query was initiated or at the server machines that store the relevant data. In other 
words, the question is whether to move the query to the data (execution at servers) or to move the data to the 
query (execution at clients). Another related question is whether and how to make use of caching (e.g., to
temporarily store copies of data at client machines). In this section we will present and discuss the trade-offs
between alternative approaches which are commonly used in existing systems today.

3.2.1 Query Shipping. The first approach is called query shipping. Query shipping is used in many relational and
object-relational database systems today (e.g., IBM DB2, Oracle 8, and Microsoft SQL Server). The principle of
query shipping is to execute
queries at servers (i.e., at the lowest level possible in a hierarchy of sites). Figure 8 illustrates query shipping in a 
system with one server. A client ships the SQL (or OQL) code of a query to the server; the server evaluates the 
query and ships the results back to the client. In systems with several servers, query shipping works only if there is 
a middle-tier site that carries out joins between tables stored at different servers or if there are gateways between 
the servers so that intersite joins can be carried out at one of the servers.

3.2.2 Data Shipping. The exact opposite of query shipping is data shipping, which is used in many object-oriented
database systems (e.g., ObjectStore and O2). In this approach, queries are executed at the client machine at which
the query was initiated and data is
rigorously cached at client machines in main memory or on disk [Franklin et al. 1993]. That is, copies of the data 
used in a query are kept at a client so that these copies can be used to execute subsequent queries at that client. 
Caching is typically carried out at the granularity of pages (i.e., 4K or 8K blocks of tuples) [DeWitt et al. 1990], and
it is possible to cache individual pages of base tables and indices [Lomet 1996; Zaharioudakis and Carey 1997]. To 
illustrate data shipping, consider the example shown in Figure 9, where some pages of Tables A and B are already 
cached at the client (represented by the dashed boxes in the figure). The scan operators at the client use these 
cached copies of pages and fault in all the pages of A and B that are not cached.

3.2.3 Hybrid Shipping. Neither
data shipping nor query shipping is the best policy for query processing in all situations. The advantages of both 
approaches can be combined in a hybrid shipping architecture [Franklin et al. 1996]. Hybrid shipping provides the 
flexibility to execute query operators on client and server machines, and it allows the caching of data by clients. The 
approach is illustrated in Figure 10, where the scan(A) and join operators are carried out at the client whereas the 
scan(B) operator is carried out at the server. The scan(A) operator uses the client's cache as much as possible and 
ships to the client only those parts of A that are not in the cache. In contrast, the scan(B) operator neither uses nor 
changes the state of the client's cache. (Section 5 contains more information about the impact of query operators
on caching.) Today, hybrid shipping is used in some database products such as UniSQL [D'Andrea and Janus
1996], application systems such as SAP R/3, database research prototypes such as ORION-2 [Jenq et al. 1990]
and KRISYS [Dessloch et al. 1998], and to some extent, in heterogeneous systems such as Garlic [Carey et al.
1995], Mind [Dogac et al. 1996], TSIMMIS [Papakonstantinou et al. 1995a], and DISCO [Tomasic et al. 1998]
(Section 4).

3.2.4 Other Hybrid Shipping Variants. For application programs that carry out SQL-style queries and
C++-style methods, one special and restricted variant of hybrid shipping is to execute the SQL-style queries at the
servers, without caching, and the C++-style methods at the clients, using caching. Such an approach has been
proposed, for example, as part of the KRISYS project [Harder et al. 1995]. Persistence is a product that supports
this approach [Keller et al. 1993]. This approach is reasonable because caching and client-side execution are 
particularly effective for methods that repeatedly access the same objects in order to carry out complex 
computations. Queries that involve a great deal of data, on the other hand, can often be executed more efficiently 
at server machines without making use of client-side caching. 
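
The client-side scan used in data shipping (Section 3.2.2) can be sketched in a few lines of Python. The sketch assumes a much simplified setting: the Server and ClientCache classes and the client_scan function are hypothetical names introduced for illustration, pages are plain lists of tuples, and cache replacement and cache consistency are ignored. The scan uses a cached copy of a page when one exists and otherwise faults the page in from the server.

```python
from typing import Dict, Iterator, List, Tuple

Page = List[tuple]


class Server:
    """Stores tables as lists of pages and counts the pages it ships."""

    def __init__(self, tables: Dict[str, List[Page]]) -> None:
        self.tables = tables
        self.pages_shipped = 0

    def num_pages(self, table: str) -> int:
        return len(self.tables[table])

    def fetch_page(self, table: str, page_no: int) -> Page:
        self.pages_shipped += 1
        return self.tables[table][page_no]


class ClientCache:
    """Client-side page cache keyed by (table, page number); never evicts."""

    def __init__(self) -> None:
        self.pages: Dict[Tuple[str, int], Page] = {}


def client_scan(table: str, server: Server, cache: ClientCache) -> Iterator[tuple]:
    """Scan a table at the client, faulting in only the pages not yet cached."""
    for page_no in range(server.num_pages(table)):
        key = (table, page_no)
        if key not in cache.pages:
            # Page fault: request the missing page from the server and keep it.
            cache.pages[key] = server.fetch_page(table, page_no)
        yield from cache.pages[key]
```

In this sketch a second scan of the same table ships no pages at all, which is the effect that makes data shipping attractive when the same data is accessed repeatedly; with a cold or ineffective cache, every page of the table crosses the network.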




Fig. 8. Fig. 9. 
Fig. 10.



Another variant of hybrid shipping is used by certain decision support products (e.g., products by MicroStrategy). 
These products have a three-tier architecture. The bottom tier is a standard relational database system that stores 
the database and carries out join processing and other standard relational operations. The middle tier then carries 
out nonstandard operations for decision support (moving averages, rollup, drill-down, etc.) [Gray et al. 1996;
Kimball and Strehlo 1995]. Again, such an architecture is a special hybrid shipping variant because query
processing is carried out at servers and at middle-tier machines, and the difference from full-fledged hybrid shipping
is that not all operations can be carried out at all the machines/tiers.

3.2.5 Discussion. The performance tradeoffs of
query, data, and hybrid shipping have been studied in Franklin et al. [1996]. Many of the effects are obvious. Query 
shipping performs well if the server machines are powerful and the client machines are rather slow. On the negative 
side, query shipping does not scale well if there are many clients because the servers are potential bottlenecks in 
the system. Data shipping scales well because it uses the client machines, but data shipping can be the cause of 
very high communication costs if caching is not effective and a great deal of unfiltered base data must be shipped 
to the clients. Obviously, hybrid shipping has the potential at least to match the best performance of data shipping
and query shipping by exploiting caching and client resources like data shipping if that is beneficial, or
otherwise by behaving like query shipping. In some situations, hybrid shipping will show better performance than 
both data and query shipping by exploiting client and server machines and intraquery parallelism to execute a 
query. The price for this improved flexibility is that query optimization is significantly more complex in a hybrid 
shipping system than in a query or data shipping system because the optimizer must consider more options.

The experiments of Franklin et al. [1996] and other studies demonstrate three other less obvious effects for hybrid
shipping systems:

Sometimes it is better to read data from the servers' disks in a hybrid-shipping system even if the
data are cached at the client. Consider, for example, a join query that involves two tables that are stored at two
different servers and assume that these tables are cached on the client's disk and that the network is fast. The best
way to execute this query might be to read both tables from the servers' disks (rather than from the client's disk
cache) and to execute the join at the client. This way, reading the data (from the servers' disks) and join processing
(with the client's disks) do not interfere with each other.



Sometimes the best strategy to execute a query in a hybrid shipping system involves shipping cached base data or 
intermediate query results from the client to a server. Such a strategy, for example, is useful in situations in which 
the data are cached in the client's main memory, the network is fast, and join operations can be carried out most 
efficiently at the server.

Transactions that involve several small update operations should be carried out at clients,
thereby putting the new versions of tuples into the client's cache. Such an approach, for example, is used
extensively by SAP R/3 [Buck-Emden and Galimow 1996; Kemper et al. 1998]. The advantage is that such
transactions can be rolled back at clients without affecting the server and that the updates can be propagated to the
server in one batch with fairly little overhead [Bogle and Liskov 1994; O'Toole and Shrira 1994]. Transactions that
involve updating large amounts of data (e.g., give all Emps a 10% salary increase), on the other hand, should be
carried out directly at the server(s) that store the affected data. This way, the original Emp table need not be shipped
from the server to the client and the updated Emp table need not be shipped back to the server either.

In all the
experiments presented in Franklin et al. [1996], the other hybrid shipping variants described in Section 3.2.4 
perform just like query shipping and perform poorly in many situations. In general, these restricted hybrid shipping 
variants may perform well for some workloads, just like data or query shipping, but only full-fledged hybrid shipping 
is able to perform well for any kind of workload. 

3.3 Query Optimization

Having described query, data, and hybrid shipping as fundamentally different approaches
for query processing, we will now show how query optimizers for query, data, and hybrid shipping systems can be 
built and describe several alternative query optimization strategies. 

3.3.1 Site Selection. From the perspective of a query optimizer, data shipping, query shipping, and hybrid shipping
can be modeled by the options they allow for site selection. Every operator of a plan has a site annotation, which 
indicates where the operator is to be executed. Table I shows the possible site annotations for different classes of 
query operators and the three alternative approaches. The table shows the possible annotations for client-server 
and peer-to-peer systems; analogous annotations can be used for multitier systems. In all three approaches, 
display operators that pass the results of select queries to application programs obviously need to be carried out at 
the client which issued the query. For all other operators, the options of the three approaches are different. Data 
shipping carries out all operators at the client (i.e., at the site at which the data is consumed). In contrast, query 
shipping carries out all the operators at servers (i.e., at sites at which the data is produced). Hybrid shipping allows 
the optimizer to annotate operators in any way allowed by data or query shipping. The special hybrid variant for 
decision support and OLAP could be characterized by specifying that scans have server site annotations, joins and 
other standard relational operators have producer site annotations like query shipping, and all the other operators 
(e.g., moving average) have consumer site annotations like data shipping. 
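
The site selection options described in this paragraph can be written down as a small lookup structure. The following Python sketch is derived only from the prose above, not from Table I itself, so the operator classes ("display", "scan", and a catch-all "other" for joins and similar operators) and all names are assumptions made for illustration. Hybrid shipping is modeled as the union of the annotations permitted by data shipping and query shipping.

```python
from typing import Dict, List, Set, Tuple

# Annotations per operator class, following the prose description:
# data shipping works at the client/consumer side, query shipping at the
# server/producer side, and display always runs at the client.
DATA_SHIPPING: Dict[str, Set[str]] = {
    "display": {"client"},
    "scan": {"client"},
    "other": {"consumer"},
}
QUERY_SHIPPING: Dict[str, Set[str]] = {
    "display": {"client"},
    "scan": {"server"},
    "other": {"producer"},
}
# Hybrid shipping allows anything that either of the two approaches allows.
HYBRID_SHIPPING: Dict[str, Set[str]] = {
    op: DATA_SHIPPING[op] | QUERY_SHIPPING[op] for op in DATA_SHIPPING
}

POLICIES = {
    "data": DATA_SHIPPING,
    "query": QUERY_SHIPPING,
    "hybrid": HYBRID_SHIPPING,
}


def allowed_annotations(policy: str, op_class: str) -> Set[str]:
    """Site annotations the optimizer may assign under the given policy."""
    return POLICIES[policy][op_class]


def plan_is_valid(policy: str, plan: List[Tuple[str, str]]) -> bool:
    """Check a plan given as (operator class, site annotation) pairs."""
    return all(site in allowed_annotations(policy, op) for op, site in plan)
```

Seen this way, an optimizer for hybrid shipping has a strictly larger search space than one for data or query shipping, since every plan that is valid under either of those policies is also valid under the hybrid policy.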

All site annotations are logical. A client site annotation indicates that the operator is to be carried out by the client 
that issues the query; such an annotation does not indicate that the operator is carried out by a specific Machine x. 
Likewise, a consumer (producer) annotation indicates that the operator is carried out at the same site as the 
operator that processes the operator's results (input). A server annotation for a scan indicates that the scan is 
carried out at one of the servers that store a copy of the scanned data. A server annotation for an update indicates 
that the update is carried out at all the servers that store a copy of the affected data. These logical site annotations
are translated into physical addresses when a plan is prepared for execution. As a result, the same plan can be 
used to execute a query at different clients so that a query need not be recompiled for every client individually. If 
there is replication, translating a server annotation for a scan involves selecting one specific server machine. This 
selection can be done heuristically (e.g., the server closest to the client) or in a cost-based manner (Section 3.3.3). 
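
The translation of logical site annotations into physical addresses can be sketched as follows. The plan representation (nested dictionaries), the function name bind, and the replica and distance tables are hypothetical and exist only for this example; the sketch uses the heuristic mentioned above (the server closest to the client) for scans, and it assumes that a producer annotation on an operator with several inputs refers to the site of the first input.

```python
from typing import Dict, List, Optional, Tuple


def bind(node: dict, client: str, replicas: Dict[str, List[str]],
         distance: Dict[Tuple[str, str], float],
         parent_site: Optional[str] = None) -> dict:
    """Return a copy of the plan in which every node has a 'physical' site."""
    ann = node["site"]
    if ann == "client":
        site = client
    elif ann == "server":
        # Heuristic: pick the replica of the scanned table closest to the client.
        site = min(replicas[node["table"]], key=lambda s: distance[(client, s)])
    elif ann == "consumer":
        site = parent_site
    elif ann == "producer":
        site = None  # decided below, once the inputs have been bound
    else:
        raise ValueError(f"unknown site annotation: {ann}")

    inputs = [bind(child, client, replicas, distance, site)
              for child in node.get("inputs", [])]
    if ann == "producer":
        # Assumption: run at the site of the first input.
        site = inputs[0]["physical"] if inputs else None
    if site is None:
        raise ValueError(f"cannot resolve site for operator {node['op']}")
    return {**node, "physical": site, "inputs": inputs}
```

Because the stored plan contains only logical annotations, the same plan can be bound for different clients; in this sketch, two clients that are close to different replicas of a table end up reading from different servers without any recompilation of the query.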
3.3.2 Where and When to Optimize. There are two questions of particular interest for query optimization in a client- 
server environment. The first question is where a query should be optimized. Hagmann and Ferrari [1986] studied 
alternative approaches in an environment with many clients and one server. They propose carrying out certain 
steps of query processing at the client at which a query originates and other steps at the server. For example, 
parsing and query rewrite could be carried out at the client whereas query optimization and plan refinement could 
be carried out at the server. This approach makes sense because operations such as parsing and query rewrite 
can very well be executed at the clients so that they do not disturb the server, whereas steps such as query
optimization require a good knowledge of the current state of the system (i.e., the load on the server) and should, 
therefore, be carried out by the server. In systems with many servers, no single server has complete knowledge of 
the whole system. In such systems, one server needs to carry out query optimization (e.g., the server located 
closest to the client). This server needs to either guess the state of the network and other servers based on 
statistics of the past, or try to discover the load of other servers by asking them for their current load. While asking 
is obviously better than guessing in terms of generating good plans, asking involves at least two extra messages for 
every server that is potentially involved in a query. 

The second question, which is related to the above question, is when to optimize a query. Again, the answer to this 
question determines the accuracy, in this case the recency, of the information about the state of the system that the 
optimizer receives. This question arises for canned queries that are part of application programs and evaluated 
during the execution of an application program. As already stated in Section 2.1, the traditional approach is to 
compile and optimize these queries at the time the application program is compiled, store plans for these queries in 
the database, and retrieve and execute these plans whenever the application program is executed. When 
something drastic happens that makes the execution of the plan impossible (e.g., when an index used in the plan is 
dropped), the plan stored in the database is invalidated and a new plan must be generated before the application 
program is executed [Chamberlin et al. 1981]. Obviously, this approach cannot adapt to changes such as shifts in 
the load of sites, and the compiled plans show poor performance in many situations. 

More dynamic approaches were proposed by Graefe and others in Graefe and Ward [1989] and Cole and Graefe 
[1994], and by Ioannidis et al. [1992]. The idea is to generate several alternative plans and/or subplans at compile 
time of the application, store these alternative plans and subplans in the database, and choose the plan or 
subplans that best match the current state of the system just before executing the query. Even more dynamic 
approaches optimize queries on the fly. The idea is to start executing a compiled or dynamically chosen plan and 
observe whether intermediate query results are produced and delivered at the expected rate. If the expectations 
are not met, then the execution of the plan is stopped, intermediate results are materialized, and the optimizer is 
called to find a new plan for those parts of the query that still need to be carried out. Urhan et al. [1998] show how 
such a reoptimization approach can be very useful to improve the response time of queries in situations in which 
the arrival of data from certain servers is delayed or bursty because those servers are heavily loaded or the 
communication links are congested. For this purpose, the approach reorders and reschedules operations at the 
client so that the client carries out other operations while waiting for the delayed data. In another paper, Kabra and 
DeWitt show how such a reoptimization approach helps in situations in which the initial plan performs poorly 
because it was based on wrong estimates of the size of tables and intermediate query results [Kabra and DeWitt 
1998]. 
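
The two ideas in this paragraph, choosing among stored alternative plans just before execution and triggering reoptimization when results arrive too slowly, can be sketched as follows. This is an illustration only: the cost formulas, the state fields `server_load` and `network_delay`, and the tolerance threshold are assumptions and are not taken from the cited papers.

```python
def choose_plan(alternatives, state):
    """Return the name of the stored alternative that is cheapest under `state`."""
    return min(alternatives, key=lambda name: alternatives[name](state))


def needs_reoptimization(observed_rate, expected_rate, tolerance=0.5):
    """Signal reoptimization when tuples arrive much slower than expected."""
    return observed_rate < tolerance * expected_rate


# Alternative plans generated at compile time, each with a (hypothetical)
# cost formula that depends on the state of the system at execution time.
alternatives = {
    "join_at_server": lambda s: 10 * (1 + s["server_load"]),
    "join_at_client": lambda s: 25 + 5 * s["network_delay"],
}

idle = {"server_load": 0.1, "network_delay": 1.0}
busy = {"server_load": 4.0, "network_delay": 1.0}

plan_when_idle = choose_plan(alternatives, idle)
plan_when_busy = choose_plan(alternatives, busy)
```

With a lightly loaded server the server-side join is chosen; once the server is heavily loaded, the same stored alternatives yield the client-side join without recompiling the application.
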

Ozcan et al. [1996; 1997] proposed another dynamic/on-the-fly query optimization approach. In that approach, 
queries are optimized and executed in two phases. First, a query is decomposed. This means that the query is 
divided into a set of subqueries that can each be executed by a single server. The final query result is composed by 
joining the results of the subqueries by the client or a middle-tier machine. Query decomposition for this purpose is 
described in Evrendilek et al. [1997]. The subqueries are processed by the servers in parallel. The order (i.e., 
schedule) in which the results of the subqueries are joined at the client depends on the speed at which the servers 
produce subquery results and the selectivity and cost of the joins which need to be carried out to combine the 
subquery results. Ozcan et al. propose a heuristic approach to decide whether to join the subquery results 
produced by two fast servers immediately or to delay a join and wait for the delivery of other subquery results from 
slower servers first. The goal is to parallelize work at the client with work at slow servers as much as possible, as in 
the reoptimization work of Urhan et al. [1998], and also to avoid the execution of very expensive joins that may 
result from poor join ordering. 

3.3.3 Two-Step Optimization. Two-step query optimization is an approach that has 
become popular for both distributed and parallel database systems [Carey and Lu 1986; Du et al. 1995; Ganguly et 
al. 1996; Hasan and Motwani 1995; Hong and Stonebraker 1990; Stonebraker et al. 1996; Thomas et al. 1995]. 
Two-step optimization is an alternative to the dynamic approaches presented in the previous section because it 
carries out certain decisions just before a query is executed. Two-step optimization also reduces the overall 
complexity of distributed query optimization. Several variants of two-step optimization exist. For distributed systems, 
the basic variant of two-step optimization works as follows: 

1. At compile time, generate a plan that specifies the join order, join methods, and access paths. 

2. Every time just before the query is executed, transform the plan and carry out site selection (i.e., determine 
where every operator is to be executed). 

Fig. 11. 

Both steps can be carried out by dynamic programming or any other 
enumeration algorithm (Section 2.2.1). Two-step optimization has a reasonable complexity because both steps can 
be carried out with reasonable effort. The first step has essentially the same, mostly acceptable, complexity as 
query optimization in a centralized database system. The second step also has acceptable complexity because it 
only carries out site selection. Furthermore, two-step optimization is useful to balance the load on a distributed 
system because executing operators on heavily loaded sites can be avoided by carrying out site selection at 
execution time [Carey and Lu 1986]. Two-step optimization is also useful to exploit caching in a hybrid shipping 
system because query operators can dynamically be placed at a client if the underlying data is cached by the client 
[Franklin et al. 1996]. On the negative side, two-step query optimization can result in plans with unnecessarily high 
communication cost. To see why, consider the example shown in Figure 11. The plan in (a) shows the join ordering 
carried out in the first step of two-step optimization; the plan in (b) shows the result of site selection in the second 
step; and the plan in (c) shows an optimal plan for this query. In the second and third plans, the site annotations are 
indicated by the shading of the operators. Tables A and D are colocated at one server (the darkly shaded server), 
Tables B and C are colocated at another server (the lightly shaded server), and the result of the query must be 
displayed at a client workstation (the unshaded site). The second plan, obtained using two-step optimization, has a 
higher communication cost than the optimal plan because the first step of two-step optimization was carried out 
ignoring the location of data and the impact of join ordering on communication cost in a distributed system. 

3.4 Query Execution Techniques 

Most of the query execution techniques presented in Section 2.3 are useful in a client-server 
environment as well as in any other distributed database system. Row blocking, for example, is essential to 
ship data from servers to clients and from clients to servers, and it has been implemented in almost all commercial 
database systems. Also, it is often attractive to carry out operations at the client in a multithreaded way. In fact, 
Web browsers like Netscape's Navigator load individual components such as text and images of a Web page in a 
parallel and multithreaded way. 

One particular issue that arises in hybrid shipping systems is how to deal with transactions that first update data in 
a client's cache and then execute a query at a server that involves the updated data. For example, consider a 
transaction that first updates the salary of John Doe and then asks for the average salary of all employees. The 
update is likely to be executed at the client at which the transaction was started in order to batch updates as 
described in Section 3.2.5. On the other hand, the optimizer will probably decide to execute the second query at the 
server that stores the Emp table in order to avoid the cost of shipping the whole Emp table to the client. The point is 
that the computation of the average salary must consider the new salary of John Doe, which is known at the client 
but not at the server. There are two possible solutions: 

Propagate all relevant updates such as John Doe's new salary to the server just before starting to execute the 
query at the server [Kim et al. 1990]. 

Carry out the query at the server and pad the results returned by the server at the client using the new value of 
John Doe's salary; for example, such an approach can be carried out using one of the techniques proposed in 
Srinivasan and Carey [1992]. 

In either case, carrying out the query at the server involves additional costs; these 
additional costs should be taken into account by a dynamic or two-step optimizer in order to decide whether it is 
cheaper to carry out the query at the server or at the client. Such issues do not arise in query shipping and data 
shipping systems. Query shipping systems do not support client-side caching and batched updates, and data 
shipping systems carry out all query operators at the client using the latest cached versions of data. 
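
The second solution, padding the server's result at the client, can be made concrete for the average-salary example. The sketch below is a simple illustration and not one of the techniques of the cited work; it assumes that the server returns the sum and the count rather than only the average, and that the client's cached updates modify existing rows and are recorded as (old value, new value) pairs. The function name `pad_average` and the salary figures are hypothetical.

```python
def pad_average(server_sum, server_count, cached_updates):
    """Correct a server-side aggregate with updates known only at the client.

    server_sum, server_count -- SUM and COUNT computed at the server on stale data
    cached_updates           -- list of (old_salary, new_salary) pairs for rows
                                that were updated in the client's cache
    """
    corrected_sum = server_sum + sum(new - old for old, new in cached_updates)
    return corrected_sum / server_count


# Three employees earn 40000, 50000, and 60000 according to the server.
# John Doe's salary was raised from 50000 to 56000 in the client's cache.
stale_average = 150000 / 3
padded_average = pad_average(150000, 3, [(50000, 56000)])
```

Here the server alone would report 50000, whereas the padded result is 52000. The extra bookkeeping at the client is part of the additional cost that an optimizer would have to weigh.
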

4. HETEROGENEOUS DATABASE SYSTEMS 

This section shows how queries can be processed in 
heterogeneous database systems. The purpose of such systems is to enable the development of applications that 
need to access different kinds of component databases (e.g., image and other multimedia databases, relational 
databases, object-oriented databases, or WWW databases). One characteristic of heterogeneous database 
systems is that the individual component databases can have different capabilities to store data, carry out database 
operations (e.g., joins and group-bys), and/or communicate with other component databases of the system. For 
example, a relational database is capable of processing any kind of join whereas a WWW database is typically only 
capable of processing a specific predefined set of queries. One of the challenges, therefore, is to find query plans 
that exploit the specific capabilities of every component database in the best possible way and to avoid query plans 
that attempt to carry out invalid operations at a component database. Another challenge is to deal with semantic 
heterogeneity [Sheth and Larson 1990], which arises, for example, if an application needs the total sales and one 
component database uses DM as a currency while another component database uses Euro. Furthermore, every 
component database has its own specific interface (API), decides autonomously when and how to execute a query, 
and might not be designed to interact with other databases. 

There has been a great deal of work on various aspects of the design and implementation of heterogeneous 
databases. In fact, there have even been excellent tutorials in the past [ACM Computing Surveys 1990], and some 
commercial systems are described in IEEE Data Engineering Bulletin [1998]. In this section, we will therefore 
concentrate on basic technology and recent developments in this area. We will present the architecture that is used 
for most heterogeneous database systems today and discuss how queries can be optimized and executed in 
heterogeneous systems. Again, keep in mind that we are only interested in query processing in this paper. Issues 
such as transaction processing in heterogeneous database systems are beyond the scope of this paper and have 
already been described, for example, in Breitbart et al. [1992]. 




Fig. 12. 



4.1 Wrapper Architecture for Heterogeneous Databases 

In order to construct heterogeneous database systems, 
several tools have been developed in recent years; examples are DISCO [Tomasic et al. 1998], Garlic [Carey et al. 
1995], Hermes [Adali et al. 1996], TSIMMIS [Papakonstantinou et al. 1995b], Pegasus [Shan et al. 1994], and 
Junglee's VDB technology [Gupta et al. 1997]. Furthermore, a number of tools have been designed for the specific 
purpose of integrating data from different relational and object-oriented databases (e.g., IBM's Data Joiner, MIND 
[Dogac et al. 1996], and IRO-DB [Gardarin et al. 1996]). An older example is HP's MultiDatabase product [Dayal 
1983]. Essentially, all of these tools have a three-tier software architecture as shown in Figure 12. Clients connect to 
a mediator [Wiederhold 1993]. The mediator parses a query, carries out query rewrite and query optimization, and 
executes some of the operations of a query. The mediator also maintains a catalog to store the global schema of 
the whole heterogeneous database system (i.e., the schema used in queries by application programs and users), 
the external schema of the component databases (i.e., which parts of the global schema are stored by each 
component database), and statistics for query optimization. Thus, the mediator has very much the same structure 
as the "textbook" query processor described in Section 2.1. The difference is that an extended query optimization 
approach needs to be used (see Section 4.2) and that certain query execution techniques are particularly attractive 
in the mediator that might not be attractive in other distributed database systems (see Section 4.3). Also, a 
mediator is designed to integrate any kind of component database. That is, a mediator does not contain any code 
that is specific to any one component database and as a result, a mediator cannot directly interact with component 
databases. 

To encapsulate the details of component databases, a wrapper (or adaptor) is associated with every component 
database. The wrapper translates every request of the mediator so that the request is understood by the 
component database's API, and the wrapper also translates the results returned by the component database so 
that the results are understood by the mediator and are compliant with the external schema of the component 
database and the global schema of the heterogeneous database. For example, a wrapper of a WWW database 
(e.g., amazon.com) that returns html pages (e.g., lists of books) must filter out the useful information (e.g., author, 
title, price, order information) from the html pages. Another example is the wrapper for a sales database that uses 
DM as currency. This wrapper must convert DM into Euro, if Euro is the currency used in the global schema of the 
heterogeneous database. In some cases, wrappers also implement special techniques such as row blocking or 
caching to improve performance. In addition, as described in the next section, wrappers also participate in query 
optimization. 
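
The role of a wrapper can be sketched with the currency example. The class and function names below (`SalesWrapper`, `legacy_sales_db`) and the sample rows are hypothetical; the conversion factor is the official fixed rate of 1.95583 DM per Euro. The mediator sees only rows that comply with the global schema, which is assumed here to report sales in Euro.

```python
DM_PER_EURO = 1.95583  # official fixed conversion rate


def legacy_sales_db(region):
    """Stand-in for a component database API that reports sales in DM."""
    data = {"north": [("north", 195.583)], "south": [("south", 391.166)]}
    return data.get(region, [])


class SalesWrapper:
    """Translates mediator requests and results for one component database."""

    def __init__(self, component_db):
        self.component_db = component_db

    def total_sales(self, region):
        # Translate the request into a call of the component database's API ...
        rows = self.component_db(region)
        # ... and translate the result into the global schema (sales in Euro).
        return [
            {"region": r, "sales_eur": round(amount_dm / DM_PER_EURO, 2)}
            for r, amount_dm in rows
        ]


wrapper = SalesWrapper(legacy_sales_db)
north = wrapper.total_sales("north")
south = wrapper.total_sales("south")
```

Because the mediator calls only `total_sales`, a different component database could be integrated by supplying another wrapper with the same interface.
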
Obviously, wrappers are fairly complex pieces of software, and it is not unusual for it to take several months to 
develop a wrapper. The TSIMMIS and Garlic projects have specifically addressed the question of how to make 
wrapper design as cheap as possible [Papakonstantinou et al. 1995b; Roth and Schwarz 1997]. Nevertheless, 
wrapper development is expensive. The good news is that similar wrappers work for many different kinds of 
component databases so that it is quite easy to adjust an existing wrapper in order to obtain a wrapper for a new 
component database. Also, as shown in Figure 12, it is possible for several component databases to be handled by 
the same wrapper. Furthermore, with the growing importance and demand for heterogeneous systems, it is quite 
likely that wrappers will be commercially available in the future for many common classes of databases. 
Fig. 13. 



One feature of the architecture shown in Figure 12 is that it is extensible. At any time, wrappers and component 
databases can be upgraded or new component databases can be integrated without changing the mediator or 
adjusting existing wrappers. Furthermore, the architecture is a software architecture. Wrappers and the mediator 
can be installed on any machine in the system. It is even possible that the mediator is distributed (i.e., that 
separate cooperating instances of the mediator are installed at different machines). 

4.2 Query Optimization 

This subsection shows how query optimization can be carried out in a heterogeneous database system. As stated at the 
beginning of this section, one of the challenges of query optimization in a heterogeneous system is that the 
capabilities of the component databases are different. The optimizer of a heterogeneous system must therefore be 
generic and be able to understand what capabilities component databases have. 
Several alternative approaches for query optimization in heterogeneous database systems have been proposed in 
the literature. One approach is to describe the capabilities of the component databases as views, store the 
definitions of these views in the catalog, and see during query optimization how a query can be subsumed by the 
views registered in the catalog [Levy 1999]. While this approach is quite flexible, it is very difficult to implement. 
Other work has proposed the use of capability records [Levy et al. 1996] or context-free grammars to describe the 
capabilities of queries and the use of various new cost-based and heuristic algorithms to generate plans for a query 
[Papakonstantinou et al. 1996; Tomasic et al. 1998]. In this section, we will focus on an approach that is based on 
existing and well-established query optimization techniques. In this approach, the capabilities of the component 
databases are described by enumeration rules, which are interpreted by the optimizer, and this approach uses 
either dynamic programming or iterative dynamic programming (Section 2.2.1) in order to find a good plan for a 
query with reasonable effort. This approach was described in full detail in Haas et al. [1997]. It was implemented for 
the Garlic system at IBM. 

1. This approach relies on well-established distributed database technology; the use of 
dynamic programming or iterative dynamic programming will generate good plans with reasonable effort just as in 
any other distributed database system. Using the same technology as most existing database products also gives 
vendors an easy migration path to adapt products for heterogeneous database systems. 

2. This approach is very flexible so that the capabilities of the component databases can be modeled very 
accurately. For example, it is possible to write enumeration rules that model gateways between different component 
databases or replication of tables at different component databases. 

3. It is usually fairly easy to implement the enumeration rules of a wrapper. The simple enumeration rules shown in 
Figures 13 through 15 are actually used in the Garlic project in order to integrate relational databases and Web 
databases such as BigBook. Enumeration rules and planning functions for wrappers can be very simple because 
these enumeration rules describe what kind of operations can be carried out by a component database rather than 
exactly how these operations are to be carried out. 

4. It is possible to define very simple enumeration rules for a 
new wrapper at the beginning and to add more sophisticated enumeration rules once the wrapper is operational. In 
fact, some very simple generic rules exist that can be used to integrate any new wrapper and component database 
[Haas et al. 1997]. 

5. New wrappers with any kind of enumeration rules can be integrated into the system and the enumeration rules of 
an existing wrapper can be altered without adjusting the enumeration rules of other wrappers or the mediator and 
without adjusting any other component of the system. 

4.2.2 Cost Estimation for Plans. Having described how 
alternative query evaluation plans can be enumerated in a heterogeneous database system, we now turn to the 
question of how to estimate the cost or response time of these plans. Both the classic and the response time cost 
models presented in Section 2.2.2 can be used for this purpose, and the cost or response time of the individual 
operators that are to be carried out by the mediator can be estimated just as in any other distributed database 
system because the mediator uses standard, well-understood algorithms to execute joins, group-bys, and so on. 
The challenge is to estimate the cost or response time of wrapper plans that are to be carried out by the component 
databases because the details of how a component database executes such a plan (i.e., a subquery) might not be 
known. 



Estimating the cost of wrapper plans in heterogeneous database systems is still an open research issue. There are 
three alternative approaches, which differ in the accuracy of the estimates and in the amount of required effort by 
wrapper developers. We will briefly describe these three approaches below. Experiments that demonstrate the 
importance of accurate cost estimates have been presented in Roth et al. [1999]. 

Calibration Approach. The first 
approach is called the calibration approach. The idea is to define a generic cost model for all wrappers and adjust 
certain parameters of this cost model for every individual wrapper and component database by executing a set of 
test queries. This way, the specific hardware and software characteristics of a wrapper and a component database 
can be taken into account. For example, a very simple generic model would be to estimate the cost of a wrapper 
plan as c * n, where n is the estimated number of tuples returned by the wrapper plan (i.e., n depends on the query) and 
c is the wrapper/component database specific parameter, which would be small for very fast component databases 
and large for slow component databases or component databases that are only reachable by a slow 
communication link. 
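
Calibration of such a one-parameter model can be sketched as follows, assuming the cost of a wrapper plan is modeled as the product c * n. The sketch fits c by least squares through the origin from the measured running times of test queries; the fitting method, the function names, and the measurements are assumptions chosen for illustration, and the cost models in the literature are considerably richer.

```python
def calibrate(observations):
    """Fit c in cost = c * n by least squares through the origin.

    observations -- list of (n, measured_cost) pairs from test queries,
                    where n is the number of tuples returned
    """
    numerator = sum(n * cost for n, cost in observations)
    denominator = sum(n * n for n, _ in observations)
    return numerator / denominator


def estimate_cost(c, n):
    """Estimate the cost of a wrapper plan that returns n tuples."""
    return c * n


# Hypothetical test-query measurements (tuples returned, milliseconds).
c_fast = calibrate([(100, 50), (200, 100), (400, 200)])
c_slow = calibrate([(100, 400), (200, 800)])

estimate_fast = estimate_cost(c_fast, 1000)
estimate_slow = estimate_cost(c_slow, 1000)
```

The fast component database ends up with c = 0.5 and the slow one with c = 4.0, so the same query returning 1000 tuples is estimated at 500 ms and 4000 ms, respectively.
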

To date, several generic cost models and sample queries have been proposed to implement the calibration 
approach for heterogeneous databases (e.g., Du et al. [1992], Zhu and Larson [1994], Gardarin et al. [1996], and 
Roth et al. [1999]). The generic cost models described in that work are significantly more complex than the simple 
example we gave above. These cost models typically define special cost formulas for single-table queries, 
multitable queries, indexed and nonindexed queries, and so on. The big advantage of the calibration approach is 
that wrapper developers need not worry much about costing issues when they design a new wrapper and/or 
integrate a new component database into the heterogeneous database. The generic cost model is predefined as 
part of the mediator, and the calibration of the generic cost model for a new wrapper and component database can 
be carried out automatically or semiautomatically using the predefined test queries. The big disadvantage of the 
calibration approach is that not all component databases can be tweaked into a generic cost model. The generic 
cost models proposed in Du et al. [1992], Zhu and Larson [1994], and Gardarin et al. [1996], for example, are 
mostly based on observations made with relational or object-oriented database systems, and they are not likely to 
be a good match for the cost of queries executed, say, by the BigBook database. 

Individual Wrapper Cost Models. An alternative to the calibration approach is to define a separate cost model for every wrapper. In this approach, the 
developer of the wrapper not only provides enumeration rules as described in the previous subsection, but also a 
set of cost formulas. One cost formula is associated with every enumeration rule in order to estimate the cost of the 
plan(s) generated by that rule. Obviously, the big advantage of this approach is that the cost of all wrapper plans 
can be modeled as accurately as possible or desired. On the negative side, however, this "do-it-yourself" approach 
puts a heavy burden on developers of wrappers. To combine the advantages of the calibration approach and this 
do-it-yourself approach, Naacke et al. [1998] proposed an approach in which costing is done by default using the 
calibration approach and wrapper developers are free to override the default and define their own cost functions 
for their specific wrappers if they feel that the calibration approach is not sufficiently accurate for their wrappers and 
component databases. Such a hybrid approach has also been adopted for Garlic [Roth et al. 1999]. 

Learning Curve Approach. The third approach to estimate the cost of wrapper plans is based on monitoring the system and keeping 
statistics about the cost to execute wrapper plans [Adali et al. 1996]. In this approach, for example, the system 
would observe that the last three plans that involved Tables A and B had costs of, say, 10 sec, 20 sec, and 9 sec. 
Based on these statistics, the cost model would estimate that the next plan involving A and B costs 13 sec. Similar 
and more sophisticated ideas of query feedback have also been studied in the standard relational context [Chen 
and Roussopoulos 1994]. Like the calibration approach, this approach releases wrapper developers from the 
burden of worrying about costing issues, but it can be very inaccurate. One particular advantage of this approach is 
that it automatically and dynamically adapts to changes in the system that impact the cost of operations (e.g., 
growing tables, hardware upgrades, different load situations). 
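
The example with Tables A and B is consistent with averaging the most recent observations: (10 + 20 + 9) / 3 = 13. A minimal sketch of such a monitor follows; the use of a plain average over a sliding window of three observations is an assumption made for illustration, and the class name `CostHistory` is hypothetical.

```python
from collections import defaultdict, deque


class CostHistory:
    """Keeps the most recent observed costs per set of tables."""

    def __init__(self, window=3):
        # Each deque drops its oldest entry once `window` costs are stored.
        self._history = defaultdict(lambda: deque(maxlen=window))

    def record(self, tables, cost):
        self._history[frozenset(tables)].append(cost)

    def estimate(self, tables):
        """Average of the recorded costs, or None if nothing was observed."""
        costs = self._history.get(frozenset(tables))
        if not costs:
            return None
        return sum(costs) / len(costs)


history = CostHistory(window=3)
for observed in (10, 20, 9):
    history.record({"A", "B"}, observed)
first_estimate = history.estimate({"A", "B"})

# A later, slower execution shifts the estimate as old observations drop out.
history.record({"A", "B"}, 31)
second_estimate = history.estimate({"A", "B"})
```

The sliding window is what lets the estimate follow growing tables or changing load, and it is also why a single unusual execution can make the estimate inaccurate.
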
4.3 Query Execution Techniques 

4.3.2 Cursor Caching. There are many workloads for which the mediator submits 
the same query, with different parameters, many times to a component database. To implement the tuple-at-a-time, 
binding-based nested-loop join, for example, the same query is submitted for every tuple of A. In addition, only four 
different kinds of queries can be submitted to the BigBook database. The idea of cursor caching is to optimize a 
query only once in order to reduce the overhead of submitting the same query to the same component database 
repeatedly. For component database systems that understand JDBC [Hamilton et al. 1997], cursor caching can be 
implemented by using JDBC's prepareStatement command to optimize the query, the set command to pass the 
binding parameter(s) every time the query is executed, and the executeQuery command to execute the query. 
Cursor caching is another technique that is extensively used by database application systems such as SAP R/3 
[Doppelhammer et al. 1997]. Similar ideas have also been integrated into several DBMS products (e.g., Oracle8 
[Lahiri et al. 1998]). 
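
The idea of optimizing a query once and executing it many times with different bindings can be sketched independently of any particular database API. In the sketch below, `toy_optimize` is a hypothetical stand-in for the component database's optimizer, and the class `CursorCache` and the query text are likewise illustrative assumptions.

```python
class CursorCache:
    """Caches one plan per query text and reuses it for every set of bindings."""

    def __init__(self, optimize):
        self._optimize = optimize
        self._plans = {}
        self.optimizations = 0  # counts how often the optimizer was invoked

    def execute(self, query_text, params):
        plan = self._plans.get(query_text)
        if plan is None:
            # First submission of this query text: optimize and remember the plan.
            plan = self._optimize(query_text)
            self._plans[query_text] = plan
            self.optimizations += 1
        return plan(params)


def toy_optimize(query_text):
    """Hypothetical optimizer: returns an executable plan for the query."""
    def plan(params):
        return (query_text, tuple(params))
    return plan


cache = CursorCache(toy_optimize)
query = "SELECT * FROM B WHERE B.id = ?"

# Tuple-at-a-time nested-loop join: one submission per tuple of A.
results = [cache.execute(query, [a_id]) for a_id in (1, 2, 3)]
```

The query is submitted three times but optimized only once. The cached plan is reused regardless of the parameter value, so it inherits the drawback of a plan chosen without knowledge of that value.
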

Cursor caching has the same tradeoffs as static query optimization (Section 3.3.2): on the positive side, cursor 
caching reduces overhead for query optimization; on the negative side, the (cached) plan might not always be the 
best plan to execute a query. In particular, the best plan can depend on the value of the query parameter. This 
effect has been studied for SAP R/3 in Doppelhammer et al. [1997]. 

4.4 Outlook 

While query processing for homogeneous and client-server databases is fairly well understood 
(Sections 2 and 3), this is not true for heterogeneous systems. Writing wrappers is a tedious task and query 
optimization is more difficult because the component databases are autonomous, have different capabilities, and 
incur costs which are hard to predict. Nevertheless, products from database vendors (e.g., IBM's Garlic [Carey et 
al. 1995] or HP's Pegasus [Shan et al. 1994]) as well as new start-up companies (e.g., Junglee [Gupta et al. 1997]) 
are already appearing on the market because the management of heterogeneous database systems is extremely 
important in practice. Furthermore, academic research projects are developing new ways in which database and 
application components interoperate (e.g., Relly et al. [1998], and Braumandl et al. [1999b]). 

This section presented a small fraction of the existing work in this area, and there is definitely a great deal of new 
work to come. However, this section showed the most important trend. When designing a heterogeneous database, 
the goal is to encapsulate the heterogeneity of the component databases and to use existing homogeneous 
distributed database technology as much as possible. 

5. DYNAMIC DATA PLACEMENT 

The previous three sections answered the following question: given a query and 
the location of copies of data and other parameters, how can this query be executed in the cheapest or fastest 
possible way? In this section, we will look at this question from a different perspective and show where copies of 
data should be placed in a distributed system so that the whole query workload can be executed in the cheapest or 
fastest possible way. 

Traditionally, data placement has been carried out statically. With static data placement, a system administrator 
decides where to place copies of data, speculating what kind of queries might be carried out at what locations in the 
system. To support static data placement, several models and tools that take the expected query workload and 
system topology as input and decide where to place copies of data have been devised (e.g., [Apers 1988]). 
Obviously, static data placement has several weaknesses: (1) the query workload is often not predictable; (2) even 
if the workload can be predicted, the workload is likely to change, and the workload might change so quickly that 
the system administrator cannot adjust the data placement quickly enough; (3) the complexity of a sufficiently 
accurate model for static data placement is too big. (The problem is NP-complete [Apers 1988].) This section is, 
therefore, focussed on dynamic data placement approaches. These approaches keep statistics about the query 
workload and automatically move data and establish copies of data at different sites in order to adjust the data 
placement to the current workload. These approaches do not aim to be perfect, but they try to improve the data 
placement with every move. As in the rest of this paper, concurrency control and consistency issues are not 
addressed. Concurrency control issues that are relevant for dynamic data placement have been addressed in, for 
example, Davidson et al. [1985], Franklin et al. [1997], Lomet [1996], and Zaharioudakis and Carey [1997]. Also, 
this section only presents techniques that decide where copies of base tables or parts thereof and of indices or 
parts thereof should be placed. Techniques that place copies of entries of the catalog at different sites are not 
discussed; such techniques have been specifically studied in Eickler et al. [1997]. 

5.1 Replication vs. Caching 

First, 
we would like to establish some terminology. In principle, there are two different mechanisms to establish copies of 
data at different sites of a distributed system: replication and caching. Seen from a high level, replication and 
caching share the same goals: both establish copies of data at different sites in order to reduce communication 
costs and/or balance the load of a system. As shown in Figure 16, however, there are a number of subtle 
differences between replication and caching. First of all, replication takes effect at server machines (i.e., data 
sources) in a client-server environment. That is, replication establishes copies of data at servers based on statistics 
that are kept at servers with the purpose of better meeting the requirements of a potentially large group of clients. 
Caching, on the other hand, takes effect at clients or at middle-tier machines (i.e., query sources),7 and caching is 
based on statistics kept at these machines. Only one client or a small group of clients, therefore, benefits from a
cached copy of a data item, but on the positive side, caching establishes copies of data directly at the places where 
the data is needed. Also, caching exploits client machine resources which might remain unused without caching 
(Section 3.2). The second difference between replication and caching lies in the granularity of the copies of the 
data. Replication is typically coarse-grained: only a whole table, a whole index, or a whole (horizontal) partition of a 
table or index can be replicated. Replicating data in a coarse granularity is acceptable because a large group of 
clients benefit from replication (as stated above), and it is quite likely that most parts of a table or index will be used 
by this group of clients. Caching, on the other hand, is typically fine-grained: individual pages of a table or index can 
be cached by a client machine, and some systems even allow the caching of individual rows of a table. Caching in 
a fine granularity is important because caching supports the queries of a single client or of a fairly small group of 
clients, and clients tend to be only interested in a small fraction of the data stored in a specific table. 



The next four differences listed in Figure 16 are based on the observation that replication decisions are usually 
more long-term than caching decisions. Again, the background for these differences is that replication is intended to 
support a large group of clients whose overall access behavior does not change as rapidly as the access behavior 
of a single client. First, replication typically involves placing data on servers' disks (in part because of the
coarse-grained nature of replication), whereas a client's working set of data typically fits in the client machine's main
memory.8 Second, server replicas are registered in the system's distributed catalog so that they can be used by all 
clients, while caching does not affect the catalog. Third, propagation-based protocols are used to keep replicas of 
data consistent and accessible at servers at all times. For caching, on the other hand, it was shown that the best 
way to maintain consistency is to use a protocol that is based on invalidation and removes out-of-date copies from 
a client's cache so that copies of data are only available in a client's cache as long as the data has not been 
updated [Franklin et al. 1997]. Finally, replicas are kept at servers until they are explicitly deleted whereas copies of 
data are kept in a client's cache until they are replaced by copies of other and more interesting data using a 
replacement policy such as LRU or until they are removed from the cache because of invalidation. 
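
The client-side half of this comparison can be illustrated with a short Python sketch. It is only an illustration under simplifying assumptions, not the protocol studied in Franklin et al. [1997]: the class ClientPageCache and its methods are hypothetical names, pages are identified by plain strings, and the server is assumed to notify the client whenever a cached page has been updated.

```python
from collections import OrderedDict


class ClientPageCache:
    """Toy client cache: LRU replacement plus invalidation of out-of-date pages."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # page_id -> contents, least recently used first

    def get(self, page_id):
        """Return the cached page, or None on a miss (the caller faults it in)."""
        if page_id not in self.pages:
            return None
        self.pages.move_to_end(page_id)  # mark as most recently used
        return self.pages[page_id]

    def put(self, page_id, contents):
        """Keep a page that was faulted in as a by-product of query execution."""
        self.pages[page_id] = contents
        self.pages.move_to_end(page_id)
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # replace the least recently used page

    def invalidate(self, page_id):
        """Remove an out-of-date copy after the page was updated elsewhere."""
        self.pages.pop(page_id, None)
```

Note that nothing in the sketch touches a catalog: the cached copies are visible only to the client that holds them, whereas a replica would have to be registered so that all clients can use it.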

The last difference between replication and caching concerns the mechanism used to establish copies of data. 
Replicas are established by a separate process that copies a table, index, or partition and moves it to the target 
server. Caching, on the other hand, is a by-product of query execution: when a table scan or index scan is 
executed at a client, the client faults in all the pages of the table or index that the client has not cached and, after
the scan is complete, the client keeps all the used pages of the table or index in its cache, if the cache is large
enough (Section 3.2.2). In other words, replication can occur at servers even if no queries are processed by these 
servers, whereas the cache of a client is empty if no queries have been processed by that client. As a 
consequence, caching decisions need to be made by the query processor while replication decisions can be made 
by a separate component that is established at every server and works independently of the query processor. 

Having listed all these differences, one may ask whether one technique is more useful than the other and whether 
both techniques are needed. We know of no study that answers this question completely, but from the discussion it 
should have become clear that caching and replication are complementary techniques and that both should be 
implemented. Replication helps to move data near to a large group of clients so that these clients can access the 
data cheaply the first time they need the data. Caching makes it possible to access data cheaply when the data are 
used repeatedly by the same client. In fact, both replication (also called mirroring) and caching are techniques that
are frequently used in the WWW and Internet. Another difference between caching and replication is that replication 
is often used in order to improve the reliability of a system in the presence of server or network failures. Due to its 
volatile nature, caching cannot serve this purpose. In this section, however, we will concentrate on the performance 
implications of replication and caching. Finally, migration is a particular form of replication in which a new copy is 
established at the target server and the old copy is removed from the original server. 

5.2 Dynamic Replication Algorithms

Several dynamic replication algorithms have been proposed in the literature
[Bestavros and Cunha 1996; Copeland et al. 1988; Ferguson et al. 1993; Gwertzman and Seltzer 1994; Sidell et al. 
1996; Wolfson et al. 1997]. These algorithms can be classified roughly into two groups: (1) algorithms that try to 
reduce communication costs in a WAN by moving copies of data to servers that are located near clients that are 
likely to use that data, and (2) algorithms that try to replicate hot data in order to balance the load on servers in a 
LAN or in an environment in which communication is cheap (i.e., high bandwidth and low delay). Furthermore, 
some replication algorithms work particularly well if the network is a tree or has some other simple structure, 
whereas other algorithms work well in any kind of network. In this subsection, we will briefly describe one specific 
algorithm, the ADR algorithm [Wolfson et al. 1997], that is targeted to reduce communication costs and works 
particularly well in tree-shaped networks. The ADR algorithm is a good representative of this class of algorithms.
The ADR algorithm is very simple, has provably good performance in certain environments, and can easily be 
integrated into most distributed systems. Other replication algorithms that help balance the load of a system and 
that are based on completely different ideas are presented in Section 6.1. 



The ADR algorithm is based on the following observation, which holds if a propagation-based "read-one-write-all"
(ROWA) protocol is used to synchronize updates and keep replicas consistent: The replication scheme of an
object (a table, index, or partition thereof) should be a connected subgraph in order to minimize the communication
costs in a tree-shaped hierarchical network. To illustrate this principle, Figure 17 shows a network with ten servers. 
In this network, an object is replicated at Servers 5, 6, and 7 (shaded in Figure 17). Even if the object is rarely 
accessed by the clients of Server 6, the object should nevertheless be replicated at Server 6 if it is replicated at 
Servers 5 and 7. When the object is updated by a client of Server 5, then this update must be propagated via 
Server 6 to Server 7 so that the extra copy of the object at Server 6 can be kept consistent without any additional 
communication cost. Likewise, Server 6's copy of the object can be kept consistent with no additional 
communication cost if the update originates at a client of Server 7, 8, 9, or 10. Regardless of where the object is
read, the copy of the object at Server 6 does not hurt either.

Based on this principle, the ADR algorithm expands and contracts the replication scheme of an object at the 
borders of the replication scheme. In the example of Figure 17, Servers 5 and 7 would keep read and write 
statistics for the object and periodically decide whether the replication scheme should be expanded to Servers 2, 3, 
4, 8, 9, or 10, be contracted, removing the replicas at Servers 5 or 7, or remain unchanged. Specifically, Servers 5 
and 7 periodically carry out the following tests based on their statistics:

Expansion Test. For each of their neighbors that is not part of the replication scheme, add the neighbor to the
replication scheme if more read requests originate from clients of that neighbor, or from clients connected to servers
of the subtree rooted in that neighbor, than updates originate at other clients. For example, if more read requests
originate from clients of Servers 1 and 2 than write requests from clients of all other servers, then Server 2 should
be added to the replication scheme.

Contraction Test. Drop the copy if more updates are propagated to that copy than the copy is read. If, for example,
more updates originate at clients of Servers 6, 7, 8, 9, and 10 than read requests originate at Servers 1, 2, 3, 4,
and 5, then Server 5 should drop its copy of the object.

If the replication scheme consists of only one server, then this server carries out a "switch test" in addition to the 
expansion test in order to find out whether it might be better to store the only copy of the object at a different server 
(i.e., carry out migration). Of course, to prevent the only copy of the object from being dropped, the contraction test 
must not be carried out if the replication scheme consists of only one server. 
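
The expansion and contraction tests can be sketched in a few lines of Python. The sketch is a simplification and not the ADR algorithm as specified by Wolfson et al. [1997]: the function names and the layout of the counters are hypothetical. It assumes that a border server counts, for one object, the reads and writes that arrive through each neighbor outside the replication scheme, the reads and writes of its own clients, and the updates propagated to it from the rest of the scheme. It also assumes, as the example above suggests, that Servers 2, 3, and 4 are the neighbors of Server 5 outside the scheme.

```python
def expansion_candidates(reads_via, writes_via, local_writes, writes_from_scheme):
    """Return the outside neighbors that pass the expansion test.

    reads_via and writes_via map each neighbor outside the replication scheme
    to the number of requests that arrived through it (i.e., from its subtree).
    """
    total_writes = local_writes + writes_from_scheme + sum(writes_via.values())
    candidates = []
    for neighbor, reads in reads_via.items():
        # Updates that originate anywhere except in this neighbor's subtree.
        writes_elsewhere = total_writes - writes_via.get(neighbor, 0)
        if reads > writes_elsewhere:
            candidates.append(neighbor)
    return candidates


def should_contract(local_reads, reads_via, writes_from_scheme, scheme_size):
    """Contraction test for the copy held by a border server."""
    if scheme_size == 1:
        return False  # never drop the only copy of the object
    reads_of_copy = local_reads + sum(reads_via.values())
    return writes_from_scheme > reads_of_copy


# Server 5 of Figure 17; all counter values are made up for illustration.
reads_via = {2: 40, 3: 5, 4: 2}
writes_via = {2: 3, 3: 1, 4: 0}
to_add = expansion_candidates(reads_via, writes_via,
                              local_writes=4, writes_from_scheme=10)
drop_copy = should_contract(local_reads=6, reads_via=reads_via,
                            writes_from_scheme=10, scheme_size=3)
```

With these made-up counters, only Server 2 passes the expansion test, and Server 5 keeps its copy because the copy is read more often than it is updated.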

5.3 Cache Investment

We will now turn to caching and a method called cache investment [Kossmann et al. 2000].
Like the ADR algorithm, cache investment keeps statistics and establishes copies of data at clients only if these 
copies promise to be beneficial. Since replication and caching are different, however, there are a number of 
important differences between the ADR algorithm and cache investment, and the ADR algorithm is not directly 
applicable to support caching. There are two basic ideas behind cache investment. The first idea is to carry out 
what-if analyses in order to decide whether it is worth caching parts of a table or index. More precisely, what-if 
analyses are applied in order to (1) compute the cost (i.e., investment) of loading a client's cache with parts of a 
table and/or index, and (2) to compute the benefits of caching parts of a table or index. The second idea is to 
extend the optimizer so that the optimizer decides to execute queries at clients if these queries involve data that 
should be cached at these clients. This way, copies of the data are faulted in at these clients and subsequent 
queries can be executed using the cache. Queries that involve data that should not be cached should be executed 
preferably at servers without extra cost for faulting in data.

To illustrate cache investment, consider a client that asks for all Emps with salary > 100,000. Section 3.2.3
mentions that there are essentially two ways to execute this query in a hybrid-shipping system: at the client or at
the server. Assuming that there is an index on Emp.salary and that the client's cache is initially empty, evaluating
this query by using an index scan operator at the client involves faulting in, say, 10 pages of the Emp.salary index
in order to evaluate the predicate and, say, another 20 pages of the Emp table in order to retrieve the name and 
manager fields of the Emps that qualify. As a result, the overall communication costs are 30 pages if the index scan 
is carried out at the client. If the index scan is executed at the server, the name and manager fields of the resulting 
Emp tuples need to be shipped from the server to the client; let's assume a total of 10 pages. As a result, a
traditional query optimizer will always decide to execute the index scan at the server. 

In this example, cache investment takes effect if the client repeatedly asks queries that involve Emps with high 
salaries. In this case, cache investment advises the optimizer at one point to generate a plan that executes the 
index scans for these Emps at the client. That plan is suboptimal (as described above), but the execution of that 
plan brings the relevant Emp index and table pages into the client's cache so that subsequent queries asking for 
Emps with high salaries can be carried out at the client with no communication cost. Without cache investment, the 
optimizer would execute all queries at the server, no data would be cached at the client, and every query would 
involve some communication cost to ship query results from the server to the client.

Taking a closer look, cache investment makes the following two calculations for every query issued at a client:

1. The investment to load the cache with the relevant index and table pages for highly paid Emps is 20 pages for our
example query; 20 is the difference in cost between the suboptimal, client-side plan that brings the pages to the
client's cache and the optimal, server-side plan. The investment might be higher or lower for other queries depending
on, among other things, the selectivity of the predicates of the WHERE clause and the number of columns of the query
result.

2. The benefit of caching all relevant pages to extract the highly paid Emps is 10 pages for our example query; 10 is
the difference in cost between the best plan for the query given that none of the relevant pages are cached, and the
cost of the best plan assuming that all relevant pages are cached. Again, the benefit of caching might be higher or
lower depending on the selectivity of the predicate and the target columns of the query.

As a result of these calculations, cache investment discovers that after three "high-salary" queries, the benefits of
caching outweigh the investment. After three queries, cache investment will thus advise the optimizer to generate a
suboptimal plan in order to load the cache with the relevant Emp data.
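
The arithmetic of this example can be written out as a short Python sketch. It only reproduces the break-even reasoning for the page counts assumed above; the function names are hypothetical, costs are whole numbers of shipped pages, and the actual policy described in Kossmann et al. [2000] takes many more factors into account.

```python
def investment(client_plan_cost, server_plan_cost):
    """Extra pages shipped by the cache-loading client-side plan."""
    return client_plan_cost - server_plan_cost


def benefit_per_query(best_cost_uncached, best_cost_cached):
    """Pages saved by one query once all relevant pages are cached."""
    return best_cost_uncached - best_cost_cached


def queries_until_payoff(invest, benefit):
    """Smallest number of queries whose total benefit exceeds the investment."""
    if benefit <= 0:
        return None  # caching never pays off
    return invest // benefit + 1


# Page counts from the example: 10 index pages + 20 table pages at the client,
# 10 result pages from the server, and no communication once everything is cached.
invest = investment(client_plan_cost=30, server_plan_cost=10)
benefit = benefit_per_query(best_cost_uncached=10, best_cost_cached=0)
payoff = queries_until_payoff(invest, benefit)
```

Here invest is 20 pages, benefit is 10 pages per query, and payoff is 3: two queries merely recover the investment, and the third makes the accumulated benefit exceed it.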

Quite a few more details need to be taken into account to make cache investment work properly; for instance, the
exact interaction of cache investment and query optimization, dealing with updates, limitations in the size of a 
client's cache, lightweight strategies to estimate the benefits and investment of caching, cost formulas for clustered 
and unclustered indices, considering response time rather than communication costs in the calculations, and 
keeping statistics in the presence of rapidly changing client workloads. All these details have been described in 
Kossmann et al. [2000], so we will not discuss them here.

To conclude, here again are the differences between caching with cache investment and replication with the ADR
algorithm:

Caching is fine-grained, making it possible
to cache only a few, frequently used pages of large tables or indices as in the example above. Shipping, caching, 
and keeping consistent copies of the whole Emp table at all the clients that frequently ask for Emp information is 
usually not practical. 

The investment to establish a copy is significantly lower with caching than with replication because caching takes 
effect when data is read from disk and shipped to the client in order to execute a query. In our example, the 
investment was 20 pages although 30 pages had to be shipped to the client. Replication always pays the full price 
of 30 pages (or even more due to its coarse granularity) to establish a copy because the replication process does 
not overlap with the execution of queries. 

As mentioned in Section 5.1, however, both replication with the ADR algorithm (or some other algorithm)
and caching with cache investment (or some similar technique) should probably be used because caching and
replication take effect at different "ends" of the system.

5.4 View Caching, View Materialization, and Data Warehouses

At the end of this section, we would like to comment on the kinds of data that can be cached and replicated. So far, 
we assumed that only base data can be cached and replicated (i.e., base tables or indices or parts of them). We 
now turn to systems that cache or replicate (i.e., materialize) derived data or views. Such systems could, for 
example, cache the average salary of all Emps that work in a research department instead of or in addition to the 
complete salary information of all Emps. 

View caching and materialization have been addressed in a number of research projects (e.g., Roussopoulos et al.
[1995], Keller and Basu [1994], Dar et al. [1996], Deshpande et al. [1998], and Dessloch et al. [1998]). View
materialization has also been implemented in Oracle 8 [Bello et al. 1998]. Data warehouses are the most prominent 
example of commercial systems that materialize and/or cache views [Widom 1995]. Data warehouses are typically 
established for decision support in companies or as product catalogs and classified ads for electronic commerce on 
the Web. They are usually installed in a three-tier environment. The data warehouse is located in the middle tier, it 
is connected to one or more data sources, and it keeps materialized views over the base data stored at those data 
sources in order to answer queries from clients without interacting with the data sources. In fact, a huge industry 
has already been formed around this concept, and data warehousing definitely deserves more attention than we 
give it in this small section. From our narrow perspective, a data warehouse, the data sources, and the clients are 
part of a distributed system in which views are materialized or cached in the warehouse. 

Compared to the replication and caching of base data, the benefits of materializing and caching views are 
significantly larger. Caching the result of a join or aggregate query, for example, might completely eliminate the cost 
of join or group-by processing for subsequent queries in addition to savings in communication costs and potential 
load balancing effects. View caching and view materialization, however, are significantly more complex to 
implement. First, keeping cached or materialized views consistent in the presence of updates is complex and often 
expensive [Roussopoulos 1991; Quass and Widom 1997], and it is unclear how invalidation-based protocols,
which have proven to be very useful to implement cache consistency, can be applied to view caching. Second, the 
ADR algorithm obviously cannot be applied to decide what views to materialize, and algorithms that carry out such 
decisions are just beginning to emerge [Harinarayan et al. 1996; Scheuermann et al. 1996; Yang et al. 1997]. 
Cache investment can be used, but there is an explosion in the number of "what-if" analyses that need to be carried 
out for every query so that a naive application of cache investment is impractical. Third, query optimization is more 
complicated and more expensive in the presence of cached and/or materialized views [Levy 1999]. The optimizer 
must determine whether a cached or materialized view is applicable; this is known as the containment or
subsumption problem [Levy 1999]. After that, the optimizer must decide which of the applicable views to use. To 
this end, the optimizer must be extended in order to enumerate read(view) plans for all applicable views just like 
other access and join plans and carry out cost-based optimization using dynamic programming or iterative dynamic 
programming (Section 2.2.1). If, for example, a materialized view involves Tables Emp and Dept and was shown to 
be applicable for a query that involves the Emp, Dept, and Division tables, the view can be used as an access plan 
for the Emp table, as an access plan for the Dept table, and as an Emp ⋈ Dept join plan. In other words, the view, if
it is applicable, can be seen as a component database that stores copies of the Emp and Dept tables and is capable
of processing joins, and query optimization in the presence of views can be carried out in the same way as query
optimization in the presence of heterogeneous component databases as described in Section 4.2.

6. NEW ARCHITECTURES FOR DISTRIBUTED QUERY PROCESSING

The previous sections presented a comprehensive set of techniques to implement distributed database and 
information systems. While this set of techniques is sufficient for most of today's applications, the advent of the 
Internet has sparked a large number of new applications and led to systems with an ever growing number of clients 
and servers. In such an environment, the conventional query processing approach presented in the previous 
sections might be too rigid. In this section, we will describe recent trends and developments. Specifically, we will 
give a brief overview of economic models for distributed query processing and dissemination-based information 
systems. 

6.1 Economic Models for Distributed Query Processing

A large variety of economic models for various aspects of
distributed computing have been studied since the mid-1980s (e.g., economic models for resource allocation, load 
balancing, flow control, and quality of service). A good survey of such techniques can be found in Ferguson et al. 
[1996]. The motivation to use an economic model is that distributed systems are too complex to be controlled by a 
single centralized component with a universal cost model. Systems based on an economic model rely on the 
"magic of capitalism." Every server that offers a service (data, CPU cycles, etc.) tries to maximize its own profit by 
selling its services to clients. The hope is that the specific needs of all the individual clients are best met if all 
servers act this way. 

Mariposa is the first distributed database system based on an economic paradigm [Stonebraker et al. 1996]. 
Mariposa processes queries by carrying out auctions. In such an auction, every server can bid to execute parts of a 
query, and clients pay for the execution of their queries. More precisely, query processing in Mariposa works as 
follows (more details can be found in Stonebraker et al. [1996]):

1. Queries originate at clients, and clients allocate
a budget to every query. The budget of a query depends on the importance of the query and how long the client is 
willing to wait for the answer. A client in Las Vegas could, for example, be willing to pay $5.00 if the client gets the 
latest World Cup football results within a second, but only 10 cents if the delivery of the results takes one minute. 

2. Every query is processed by a broker. The broker parses the query and generates a plan that specifies the join 
order and join methods. For this purpose, the broker may employ an ordinary query optimizer for a centralized 
database system based on, for example, dynamic programming. 

3. The broker starts an auction. As part of this auction, every server that stores copies of parts of the queried data 
or is willing to execute one or several of the operators specified in the broker's plan is asked to give bids in the form 
of (Operator o, Price p, Running Time r, Expiration Date x). In other words, with such a bid a server indicates that it
will be willing to execute Operator o for p dollars in r seconds, and that this offer is valid until the expiration date x.

4. The broker collects all bids and makes contracts with servers to execute the queries. Doing so, the broker tries to 
maximize its own profit. If, for example, the broker finds a way to execute the Las Vegas query from above in a 
second, paying only $1.00 to servers, the broker will pursue this way and keep $4.00 of the budget as profit. If the
query cannot be evaluated with acceptable cost in one second, the broker will try to find a very cheap way to
execute the query in a minute and keep a couple of cents as profit. If the broker finds no way to execute the query
within the time/budget limitations, the broker will reject the query. In this case, the client must raise the budget,
revise the response time goals, or just be happy without the answer. 
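
The broker's decision in step 4 can be sketched in Python as follows. This is a minimal illustration and not Mariposa's actual bidding protocol: the Bid class, the function choose_contracts, the servers A, B, and C, and all prices and running times are hypothetical. The sketch assumes that prices are given in cents, that the operators of a plan run one after the other, and that the client's budget is given as a list of response time goals with the amount paid for each.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Bid:
    server: str
    operator: str
    price: int           # cents
    running_time: float  # seconds
    expires: float       # the offer is valid until this time


def choose_contracts(operators, bids, budget_curve, now):
    """Pick one bid per operator so that the broker's profit is maximal.

    budget_curve is a list of (max_seconds, cents) pairs: the client pays
    `cents` if the answer arrives within `max_seconds`.
    Returns (profit, chosen_bids), or None if the query must be rejected.
    """
    valid = [b for b in bids if b.expires >= now]
    per_operator = [[b for b in valid if b.operator == op] for op in operators]
    best = None
    for combo in product(*per_operator):  # yields nothing if an operator has no bid
        cost = sum(b.price for b in combo)
        # Simplifying assumption: operators run one after the other.
        elapsed = sum(b.running_time for b in combo)
        payable = [cents for limit, cents in budget_curve if elapsed <= limit]
        if not payable:
            continue  # too slow for every response time goal
        profit = max(payable) - cost
        if profit > 0 and (best is None or profit > best[0]):
            best = (profit, combo)
    return best


# The Las Vegas query: $5.00 within one second, 10 cents within one minute.
budget_curve = [(1.0, 500), (60.0, 10)]
bids = [
    Bid("A", "scan", 60, 0.4, expires=100.0),
    Bid("B", "scan", 5, 20.0, expires=100.0),
    Bid("A", "join", 40, 0.5, expires=100.0),
    Bid("C", "join", 3, 30.0, expires=100.0),
]
result = choose_contracts(["scan", "join"], bids, budget_curve, now=0.0)
```

With these made-up bids the broker contracts Server A for both operators, pays $1.00, and keeps $4.00 as profit. If Server A does not bid, the only profitable alternative is the slow plan via Servers B and C, which leaves the broker a profit of 2 cents; if all bids have expired, the query is rejected.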

At first glance, Mariposa's query processing approach does not appear to be very different from the techniques 
presented in Sections 2 and 3. Mariposa carries out two-step optimization as described in Section 3.3.3, making it
possible to avoid heavily loaded or slow servers. The beauty of Mariposa is that different servers can flexibly 
establish different bidding strategies in order to achieve high revenue. For instance, a server might specialize in 
high-end or low-end services. Using an example from real life, there are expensive restaurants for people that like 
to eat well and fast-food restaurants for people with other needs. This diversity makes it possible to meet the eating 
habits of a large group of people. Mariposa supports such a diversity in the services provided by a distributed 
database system. 

Another advantage of Mariposa is that dynamic data placement fits nicely into Mariposa's economic approach. In 
addition to the revenue for executing query operators, servers can make a profit by buying and selling copies of 
data [Sidell et al. 1996]. The soccer WWW server located in Paris, for example, was not able to handle all the 
requests from all over the world during the World Cup finals in 1998. Using Mariposa, that server could have
allowed other servers, say, in Brazil or Nigeria to replicate the results of the soccer matches and have gotten
additional revenue for selling the original copy of the Results table and for propagating all the updates. Servers in 
Brazil and Nigeria would have bought copies of the Results table to bid for queries that involve that data and/or sell 
copies of that data to other servers (e.g., in Argentina or Cameroon).

While all these concepts sound very promising and a version of Mariposa is already available commercially 
(distributed by Cohera), it is still unclear how well Mariposa and other systems with economic models will work in 
practice. There is a significant amount of research required to find out how to configure the bidding and data 
buying/selling strategies of servers and how to keep the overheads of the bidding protocols within reasonable limits.

6.2 Dissemination-Based Information Systems

Throughout this paper, a request-driven data delivery model was
assumed. In this model, users or application programs (i.e., clients) are active and initiate queries; servers are 
passive and process queries upon request. Lately, there has been a great deal of interest in push technology. In 
this model, servers are active and disseminate data to clients before the clients ask for the data. An early 
incarnation of push is TeleText, provided by most European TV stations since the mid-1980s. Furthermore, both
Netscape's Navigator and Microsoft's Internet Explorer provide features to allow clients to passively listen to data 
that is disseminated by WWW servers. PointCast's Screensaver, which displays news and commercials based on a 
user's profile of interests, is another product in this domain. A good overview of these and other push-based 
systems is given in Franklin and Zdonik [1998]. One reason for this interest is that many people like to obtain all the 
information they are interested in with virtually no effort. In addition, there are a number of technical reasons in 
favor of push, in particular if data is disseminated in networks that support broadcasts or IP multicasts. Most
importantly, push-based systems scale better than traditional request-driven systems. Rather than processing
every request individually, push-based systems satisfy the requests of several users by disseminating the results
only once [Aksoy and Franklin 1998]. Data push and request-driven access to data can also be combined in order 
to achieve high scalability and satisfy unusual user requests at the same time [Acharya et al. 1997]. Other 
interesting aspects are client-side caching in push-based systems [Acharya et al. 1995; 1996], and multitier 
architectures for data dissemination [Franklin and Zdonik 1998]. 

Unfortunately, SQL-style query processing has not yet been studied in the context of push-based systems. For 
example, it is still unclear which of the techniques presented in this paper would be applicable for a push-based 
system. A great deal of future work remains to be done in this area. 

CONCLUSION

In the last decade, the landscape of distributed database and information systems has changed
tremendously. Network technology has become mature, and as a result, businesses rely more and more on 
distributed and on-line data processing architectures as opposed to monolithic and batch-oriented architectures. In 
addition, a whole new generation of distributed database applications is appearing, exploiting, for example, the 
Internet or wireless communication networks for mobile clients. Furthermore, most systems today have a
client-server or a multitier architecture, and many complex systems are composed of several subsystems from potentially
different vendors with heterogeneous data processing capabilities and APIs. In this paper an overview of the state 
of the art in distributed query processing was given. This paper discussed various query processing techniques 
developed for recent products and research prototypes and showed how they can be applied to different types of 
distributed systems. Many different architectures and applications can be found today, but all these architectures 
can roughly be characterized by their communication paths (client-server, peer-to-peer, or multitier) and by the 
capabilities of the sites of the system (homogeneous or heterogeneous). For each category, the paper presented 
and discussed the set of query processing techniques that are particularly effective. For instance, the paper
showed how to exploit client resources in a strict client-server system and how to exploit the query capabilities of 
individual sites in a heterogeneous system. 

Independent of the specific architecture, all distributed database and information systems today are based on the 
following two principles:

Best effort: the query processor always tries to execute a query as fast as possible or with
as little cost as possible. At the heart of this strategy is a query optimizer, which decides for every query which 
query execution methods to use (e.g., which join method), where to execute these methods, and in which order to 
execute these methods. The optimizer can be used statically in order to compile a query once and for all. The
optimizer can also be used dynamically just before a query instance is executed or on the fly while the query is 
executed in order to adjust to the current state of the system. 

Flexible data placement: in order to improve the performance of a whole query workload, caching and/or replication 
can be used in order to place data at or near sites where the data are frequently used. Both query optimization and 
caching/ replication are extensively studied in this paper. 

Combined, the set of techniques presented in this paper should be sufficient to support most of today's database
applications. Nevertheless, there are a number of avenues for future work:

No vendor has implemented all or even
a significant portion of the techniques described in this paper. Conceptually, the pieces fit together well, but it is 
nevertheless not always easy to integrate a new technique into an existing system. For example, it is possible to 
extend a query optimizer to consider a new query evaluation algorithm, but doing so might substantially increase 
the running time of the optimizer. As a result, a tricky compromise must be found that extends the optimizer so that 
the new algorithm is supported reasonably well and the increase in optimization time is tolerable. 

The techniques described in this paper can be implemented as part of a distributed database management system 
or as part of a database application system. Preferably, of course, the techniques should be implemented as part of 
a database management system so that any kind of application can directly benefit from them. In fact, however, 
several of the techniques presented in this paper have been implemented as part of the SAP R/3 business 
application system [Kemper et al. 1998] because standard, off-the-shelf database management systems have not 
yet implemented these techniques. This situation might cause a great deal of confusion, and ultimately 
certain application systems might not work well with certain database management systems if conflicting 
techniques are carried out on both ends or important techniques are not carried out at all. Coordinating all the 
different query processing activities is a difficult task in such systems. The situation is getting worse with the current 
trend to design and market application and database management modules that can be freely plugged together 
and may interact in unpredictable ways. 

Scalability remains a major concern. It is still unclear whether query optimization and the "best effort" approach 
work in a system with tens of thousands of servers and millions of clients because so far nobody has been able to 
simulate query processing at such a large scale. In addition, further research is necessary in order to find out how 
well economic models and data dissemination models work for large-scale query processing. 

Most of this paper was focused on structured (i.e., relational) data. There is still a great deal of work to be done in 
order to integrate other types of data (e.g., XML, text, and images). Furthermore, it is important to deal with 
approximate or partial answers on the Internet; on the Internet, failure is the rule rather than the exception. 

[Footnote] 

1 A scoring function f is defined as monotonic if s1(a) < s1(b) ∧ s2(a) < s2(b) implies that 
f(s1(a), s2(a)) < f(s1(b), s2(b)). 
[Footnote] 

2 Caching in the granularity of individual tuples, for example, has been studied in Kemper and Kossmann [1994]. 

[Footnote] 

4 Sometimes, the terms federated or multidatabase system are used in the same way as we use the term heterogeneous 
database system. 

[Footnote] 

5 Other annotations that may be used by rules include the tables, columns, and predicates involved in a subplan or the 
sorting order in which the toplevel operator produces its output [Haas et al. 1997; Lohman 1998]. 

6 More precisely, the wrapper would construct the SQL query taking into account the table, column, predicate, and 
sorting order annotations of the root operator of the plan. 

[Footnote] 

7 In a multitier environment, caching can be established at all tiers. Caching data at several levels (i.e., hierarchical 
caching) is also carried out as part of the Internet. Furthermore, many institutions provide proxy caches in order to serve a 
group of clients [Luotonen and Altis 1994]. 

[Footnote] 

Note that WWW browsers like Netscape cache data on a client's disk, and disk caching has also been shown to be useful 
in the general database context [Franklin et al. 1993]. 

[Reference] 